Running an Arc A770 GPU on my Linux workstation. After Nvidia messed up Wayland support so badly, the switch is easy. Fully open source drivers, everything is smooth as butter, and even PyTorch, Blender etc. are (experimentally) supported. Pretty great if you ask me.
What specifically is your issue with ROCm & HIP that makes it a joke? It has actually worked great for me with both Blender and PyTorch (Stable Diffusion) on Debian with my 6700 XT.
Extremely limited support, for one. Why is it locked behind the 6000-series cards? The project has been around since the days of the 500-series cards, and they utterly ignored that entire market for years.
What exactly is locked to 6000-series cards? I can't speak specifically to having had any old 500-series card, but I was using even more ancient 400-series cards (an RX 480 from 2016) with ROCm for years up until recently. So I am not sure what issue you would have with the 500 series other than it probably being old, slow hardware, but I don't see why they wouldn't work otherwise¹.
AMD doesn't officially support any recent consumer GPUs, and there are long-standing issues with ROCm failing to build or run on them, sometimes only resolved a year or more after a card launches.
While AMD might not have "official" support for its consumer lineup in what has traditionally been a non-consumer area of GPGPU, there is plenty of evidence that it still works just fine on most of those cards (once compiled properly, and sometimes with the right environment flags), as witnessed by me and even by people in the link you reference. I agree that "official" support would be great eventually, but as long as it works, that's adequate for me on consumer hardware.
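For what it's worth, the "right environment flags" bit usually looks something like this on not-officially-supported RDNA2 cards (the override value below is the one commonly reported for gfx1031 parts like the 6700 XT; treat it as an example, not gospel):

    # Reportedly needed on some unsupported consumer cards (e.g. gfx1031 / 6700 XT) so the
    # runtime uses the gfx1030 code objects instead of refusing to run:
    export HSA_OVERRIDE_GFX_VERSION=10.3.0

    rocminfo | grep -i gfx           # the card's gfx target should show up here
    clinfo | grep -i "device name"   # rocm-opencl should now list the GPU as an OpenCL device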
I think the thread I linked shows there is substantial frustration here, and the reason I didn't purchase a Navi card was in part due to widespread reports that ROCm was untenable on the platform.
> once compiled properly and sometimes with the right environment flags
Not interested in apologetics for the corporation. If AMD wants to compete with CUDA, they have to support the consumer GPUs that are accessible to hobbyists and non-experts. This is user-hostile support.
Yes, I mean it's all beta and such. But it's fully open source, and from what I've seen of oneAPI they are doing all the right things recently. Let's hope that Gelsinger is to Intel what Nadella is to Microsoft.
Historically Intel has had the best open-source Linux drivers, with OpenCL and OpenGL support (unlike AMD ROCm, which requires a custom kernel and closed-source components, and unlike the fully closed Nvidia stack).
Unfortunately for Arc, they pretty much marketed it solely for gaming, an area where the chip and drivers don't really excel. Linux support, AV1 encoding, and so on are its strong points, not raw performance. You can argue that those things don't make as much money as the gaming market, but if Intel is beaten by the competition on performance, they'll need to find another route to sell this.
Raw performance is not there yet, but the "Max" GPUs seem to be quite powerful and on par with Nvidia (on paper). Blender and gaming (via Proton) already work quite well under Linux too, which is great considering the drivers are not even fully merged into mainline yet.
Not sure what you are talking about as far as AMD?
This statement in particular:
> "unlike AMD ROCM which requires a custom kernel and closed-source components"
In my experience, ROCm does not require a custom kernel or special kernel modules (it works just fine for me with stock distro kernels), nor does it require closed-source components. OpenCL and OpenGL have pretty much always worked fine for a long time now. I think AMD drivers are just as good as Intel's, especially when it comes to how open source they are. While it's cool that Intel already has AV1 encoding on their new cards, I fully expect AMD will have just as good AV1 encoding on their next-gen RDNA 3 cards when they come out soon.
Fedora 37 (released about two weeks ago) is the first release where I ever got OpenCL working on Vega, via rocm-opencl. Until 37, ROCm was not packaged at all, and the upstream packages support only Ubuntu and CentOS.
Even with 37, ROCm is not complete: HIP is missing, and Blender doesn't work and asks for the proprietary driver (which, again, supports only Ubuntu and CentOS).
I think you have gotten to the crux of it: package maintainers in many distros don't seem to have done a particularly good job when it comes to AMD, but that is hardly AMD's fault, as they do a fairly good job of running their own full binary repo¹ in various package formats. I would therefore recommend switching to AMD's repo, as it definitely contains the HIP runtime; it would appear you just need to add https://repo.radeon.com/rocm/yum/rpm to your package manager on Fedora. For ROCm OpenCL, my only pointer is to make sure the dynamic library path is set up, for example:
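Something along these lines, assuming AMD's default install prefix of /opt/rocm (adjust if your packages put it elsewhere):

    # make the ROCm runtime libraries visible to the dynamic linker
    export LD_LIBRARY_PATH=/opt/rocm/lib:$LD_LIBRARY_PATH

    # or register the path system-wide instead:
    echo /opt/rocm/lib | sudo tee /etc/ld.so.conf.d/rocm.conf
    sudo ldconfig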
I am pretty sure you should be able to resolve both OpenCL and Blender+HIP support given the configuration details you've shared so far, and I hope you figure it out.
Did some research into this and it looks like ROCm is much easier to set up now, as you mention. I remember it required a lot of changes to the stock system to get working in the past, but I haven't touched ROCm in a long time.
Your comment suggests ignorance on three fronts. First, you're suggesting that because Wayland is a "super-process" of Xorg, it has no value or purpose. This is not accurate, as Wayland may offer certain advantages over Xorg, such as improved performance or greater security.
Second, you're implying that because Wayland is a newer display server, it lacks the software support that Xorg has. This is not the case, as many popular Linux applications and desktop environments now support Wayland (Firefox, GNOME, Gimp, LibreOffice) and more are being added all the time.
Third, you're suggesting that Wayland is built on top of Xorg and is not distinct from it. Wayland and Xorg are two different display servers with completely different architectures.
Also, there are many reasons why parent would want to use Wayland: it's supposedly more secure, and lighter to run.
Chrome, Firefox, Emacs. (And because of Chrome you get the long tail of Electron apps.) It's quite rare that I use an XWayland application anymore.
Getting multiple monitors of different resolutions to display content nicely is literally impossible on X, so I've been very happy with the improvements since moving to Wayland.
I have a dual monitor setup, with one 1440p and one 2560x1080 ultrawide, and I don't have any issues with X11. It just works out of the box. Something I can't say of Wayland, which crashes every time I try it.
Different scaling ratios in X11 do not work - period. You can define different ratios per monitor, but application windows will take the scale of the monitor they open on and never change.
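(The per-monitor "ratios" X does let you define are basically xrandr's per-output scaling, which resamples the framebuffer rather than telling applications anything about DPI; a rough sketch, with placeholder output names:)

    # Run the desktop at 2x UI scale for the HiDPI panel (GDK_SCALE=2, Xft.dpi, etc.),
    # then have X render the low-DPI monitor at 2x and downsample it, so things come out
    # roughly the same physical size there - at the cost of blur and extra GPU work.
    # Output names are examples; check `xrandr` for yours.
    xrandr --output eDP-1 --auto --output HDMI-1 --auto --scale 2x2 --right-of eDP-1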
Most distros are now on Wayland by default for a reason.
Gnome support is excellent.
Sway support is excellent.
KDE support is... not as excellent, but still pretty good at this point.
Ubuntu 21.04+ is on Wayland by default.
Fedora 35+ is running Wayland by default (minus KDE, where they default to x11 still, but ship a plasma-wayland session).
OpenSUSE Leap 15+ is running Wayland by default.
Debian Buster+ is running Wayland by default.
Garuda is running Wayland by default.
Manjaro Sway & Gnome are Wayland by default (again KDE is not - but the plasma-wayland session is included).
Basically - Wayland is not "the future" anymore. It's the "Right fucking now" across basically every major distro. Even KDE based distros are discussing moving to plasma-wayland by default at this point, since it's improved a lot in the last two years.
Which mostly just works as long as you have PipeWire and xdg-desktop-portal-kde installed (the base plasma-wayland session usually includes them).
This one is a bit less polished - some users still have problems with keyboard input, depending on the distro and other installed packages.
For Sway:
xdg-desktop-portal-wlr works just fine for screen sharing, and you can use https://github.com/any1/wayvnc for VNC access (including having a completely headless machine).
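A rough sketch of the fully headless variant (assumes sway and wayvnc are installed; the WLR_* variables are standard wlroots backend switches):

    # start sway on the headless wlroots backend (no physical outputs or input devices)
    WLR_BACKENDS=headless WLR_LIBINPUT_NO_DEVICES=1 sway &

    # then serve the virtual output over VNC (all interfaces, default port 5900)
    wayvnc 0.0.0.0 5900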
It's... not X... why would it do X forwarding? That's like criticising Windows for not doing X forwarding (there's enough to criticise Windows about as it is; there's no need to add imaginary things).
I think what's really meant here is different pixel densities -- e.g. a high-dpi laptop screen and a low-resolution external monitor. I'm not sure exactly how Wayland handles this, but X doesn't do a very good job.
Correct, I should have said a mixture of high-dpi and non-high-dpi screens rather than mentioning resolution. Thanks for the clarification.
What I used to have on X was one monitor with 2x the pixel density of the other. You either have things tiny on one and massive on the other, or vice versa. What I did for a while was configure everything to be halfway between, so everything was just _slightly_ off on both, but it was kind of horrible (and made the fonts look terrible).
Chrome does default to X11, but that is easily changeable via chrome://flags.
With other Electron apps, it depends. Some are easy, some are not. The notably difficult one is VS Code, which handles its args itself, ignores (electron|chromium)-flags.conf, and processes the command-line flags only after the Electron framework is initialized (so the Ozone flags have zero effect).
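For reference, the flags people usually drop into those files look something like this (whether the file is read at all depends on your distro's launcher wrapper, so treat this as a sketch):

    # ~/.config/chrome-flags.conf (or chromium-flags.conf / electron-flags.conf)
    --ozone-platform-hint=auto
    --enable-features=WaylandWindowDecorations

The hint flag picks Wayland when a Wayland compositor is available and falls back to X11 otherwise.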
As vetinari says (you can also put chrome config in ~/.config/chrome-flags.conf).
He also pointed out, quite rightly, that it very much depends on the version of Electron that's used and how it's used, so it's something of a crapshoot whether they'll support things nicely.
I have to use Xorg from time to time because Wayland on Ubuntu 22.04 (Intune only supports LTS) is pretty broken when it comes to screen sharing. This is why someone would prefer Wayland:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
7183 jono 20 0 25.9g 187448 103628 R 52.0 0.6 41:44.25 Xorg
I've uninstalled X and XWayland from my primary laptop and don't miss it at all! Zero tearing is beautiful, and PipeWire has completed the equation very nicely, so all the usual suspects that used to be problems (e.g. screensharing) work seamlessly.
On their desktop application? I use their web version, which does support it. Either way, their desktop version always used to say that screen sharing on my platform was unsupported.
I thought they used some GNOME-only screenshot API which doesn't work very well?
Thanks! I wasn't aware. I needed to set a couple of environment variables, and it's still broken, but it shows something (very strange artefacts; it looks like it's capturing garbage).
Depending on the game, Intel will switch between their native driver for DX9 and using DXVK to run DX9 stuff via modern Vulkan instead. The driver will decide.
They do, since the 3950 and newer builds. That's why 3953 provides almost double the FPS in games where they use the native ones (such as CS:GO), not to mention much better frame times.
Badass. Way to cut loose an old anchor. I love this trend: build well for the new thing, and use open-source compatibility layers to make the old, not-so-good Microsoft APIs work.
That said, Intel Arc has been slow as dirt for old games. They seem to have, through hard work, turned the corner. I'm curious how much of that improvement has gone upstream to the public and to others, versus how much Intel has done only for themselves after putting all their eggs in this open source community project.
I imagine the problem is lack of game-specific optimisations in their drivers. In any other field, hacking in optimisations to spot and special-case an algorithm that is used as a benchmark for the hardware would be considered close to fraud, but for the graphics crowd it's all in a day's work.
It comes down to the fact that, when comparing GPUs from a general consumer perspective, the goal isn't (unlike traditional benchmarking) to ask "how long do these take to execute this exact same algorithm" but "how long do these take to produce these visually indistinguishable images". The end user doesn't really care whether the algorithm is the same; they just want the same output produced faster.
I remember in 2008 or thereabouts there was some controversy because one vendor rendered some things at a higher quality than the other. It was a relatively small thing, like the edges of alpha textures or texture filtering. For a while some video card reviews would include a visual quality comparison because there could be noticeable differences between the two.
There were still various 'hard' fixed function components in GPUs which would cause that kind of visual difference back in the day.
Technically this does still sort of happen these days (e.g. the period when NVIDIA cards had ray tracing support but AMD cards did not, the quality of the video encoder, or nowadays the various super-resolution methods used by each vendor). Those cases tend to get treated separately in third-party benchmarks, since not everyone cares about them.
They don't do this only in benchmarks; they do it in actual games too!
Hearing that NVIDIA engineers actually re-implemented many of the unoptimized shaders shipped in triple-A games, and made their drivers dynamically swap in their own shader implementations on the fly, is a testament to how much NVIDIA really did care about game driver performance.
If DirectX were an open standard, we would have had almost no problem writing translation layers between all the versions and other APIs. Unfortunately, since writing them is a reverse-engineering hack job, there's still too much performance left on the table.
So far, backward compatibility for D3D games on Windows has been excellent though. I really wonder why Intel is the only GPU vendor who has trouble supporting older D3D versions (and if this is such a problem for GPU vendors, why Microsoft doesn't step in and provide a generic D3D9-on-D3D12 driver).
(also I think the actual problem isn't a generic D3D9 emulation, but that a good emulation needs to implement thousands of game-specific performance and compatibility hacks which are usually taken care of by the GPU vendor or Microsoft)
Intel were apparently using that for backwards compatibility on Arc[1]. I guess it wasn't good enough and they're using DXVK instead? Or is this new "native" mode actually DXVK?
They may be using both of those. They mention in the video linked from the article that the driver is choosing between native and the hybrid approach where appropriate, depending on what game/app is running. Maybe they also choose between different hybrid options as well?
That's what I gathered. Since D3D9On12 compatibility is supposed to be complete and supported by Microsoft themselves, and DXVK compatibility is hit or miss but much higher performance, they're whitelisting popular titles that they can validate for DXVK on Windows. It's the right approach.
There's zero sense in writing a DX9 driver today. Not with the performance you can get out of DXVK. Most of the remaining popular DX9 games will be moved to native Vulkan or DX12 at some point regardless.
Some of this seems to be Arc trying to be "legacy free" and not including much in the Arc drivers/hardware to specifically cater to legacy APIs. AMD and Nvidia drivers are almost certainly eldritch abominations full of legacy. Intel also seems to have at least partly started anew on the drivers, although I doubt gaming performance was a major aim for Intel iGPU drivers before Arc. They probably thought that Microsoft has an emulation layer, so that will be good enough, right? I bet the Microsoft team were thinking more along the lines of "we want casual games like The Sims to work on ARM netbooks with a modern-API-only GPU, and nobody's going to be using those sorts of things for maximum-performance 110% stuff...".
As stated by others in the comments section, DX9 is a huge, complicated beast. Nvidia and AMD have decades of experience here, so dragging it along with new generations of GPUs is relatively easy. Intel would've had to start almost from scratch. They're betting on the future, where backwards compat with it becomes less and less important and the power of their GPUs keeps increasing, so the overhead for older titles becomes less noticeable.
Intel's integrated GPUs have pretty good "legacy D3D" support though (much better than their legacy GL support, anyway)... it's not like they are newcomers when it comes to graphics driver development. Also, D3D11 support will be relevant for at least the next 10 years.
I forget where, but in at least one YouTube interview they explained that they thought this too. They thought their integrated graphics legacy would essentially transfer over to making desktop cards, but sadly it wasn't the case. What they did in the drivers for the integrated chips, and the way those chips were designed, was not transferable knowledge. So they did have to start mostly from the ground floor on the Arc GPUs.
Well, on Vista+ versions of Windows, support for DX8 and earlier is generally pretty bad... and those early 3D games are too computationally intensive to be fully emulated!
Is there? Many games actually run faster on Linux via Proton than on Windows with the same hardware. I think the performance overhead of translation layers is pretty overstated in 2022.
They are, and it's because they're able to multithread more in DXVK and return the results faster than in most native DX9 implementations.
Translation layers for a currently-intensive API like DX12 or Vulkan would be worth the criticism, and as developers most of us prefer as close to the metal as possible.
Eventually even Nvidia and AMD will drop their native DX9 drivers. It's driver bloat and requires some level of maintenance for new GPU architectures. I expect all vendors to use D3D9On12 at some point.
It's like if our hardware directly supported DOS applications. We wouldn't need DOSBox for our AdLib emulation, but everyone agrees that DOSBox is the best approach.
The biggest DX9 titles, StarCraft 2 and CS:GO, will be migrated to DX12 and Vulkan (respectively) if they continue to remain popular. CS:GO already has experimental Vulkan support on Linux.
I'm quite sure Steam still displays the SteamOS+Linux badge, I wonder if perhaps you're thinking of the Steam Deck certified badge? That does include games which run through Proton, it just means that the game runs properly.
Can anyone explain to me the performance difference between having a native OpenGL or DX9/10/11 userspace driver vs having a Vulkan/DX12 userspace driver and using Zink or DXVK to implement the older APIs?
From my understanding, Vulkan/DX12 are APIs that are supposedly very low-level and allow fine-grained control over things like memory access, synchronization, and hardware-level optimizations (usually through the use of extensions). OpenGL and DX9/10/11 are basically higher-level abstractions that can be implemented on top of the lower-level APIs.
Assuming the above is accurate, what advantage do you get by having "native" OpenGL or DX9/10/11 implementations on your GPU? Is there anything Intel is missing out on by using DXVK as opposed to implementing their own?
And how does Gallium3D (from the Mesa project) fit in all of this for Linux? Is that a lower-level abstraction than Vulkan?
> Assuming the above is accurate, what advantage do you get by having "native" OpenGL or DX9/10/11 implementations on your GPU? Is there anything Intel is missing out on by using DXVK as opposed to implementing their own?
One key issue is that Vulkan makes assumptions about the structure of graphics pipelines, which can make it harder to optimize the driver when an application changes a piece of state. For example, if an application changes the face culling mode, this may require recompiling the entire pipeline, which can be inefficient.
In contrast, native OpenGL or DirectX implementations allow for more flexible and efficient ways of managing state changes. For instance, they may provide hardware toggles that can be used to quickly and easily update the state without needing to recompile the entire pipeline. This can significantly improve the performance of the driver.
They're adding new VK_EXT_extended_dynamic_state extensions to allow setting state dynamically, but the extensions don't support every single random piece of hardware state, and some hardware state can be limited or weird enough that it's hard to export in any sort of generic interface. Try explaining to a user that dynamic state A only works if there's enough free internal memory that hasn't been filled up by state B + C or D + A.
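To make the dynamic-state point concrete, here is roughly what it looks like at the API level (a sketch in C using the real VK_EXT_extended_dynamic_state symbols, but not tied to any particular driver): cull mode is declared dynamic at pipeline creation, then flipped per draw with a command instead of baking a new pipeline.

    // Sketch: cull mode as dynamic state via VK_EXT_extended_dynamic_state.
    #include <vulkan/vulkan.h>

    // At pipeline creation time: mark cull mode as dynamic, so changing it later
    // does not require creating (and compiling) a whole new VkPipeline.
    // dynamic_info would be passed as pDynamicState in VkGraphicsPipelineCreateInfo.
    static const VkDynamicState dynamic_states[] = {
        VK_DYNAMIC_STATE_CULL_MODE_EXT,
    };

    static const VkPipelineDynamicStateCreateInfo dynamic_info = {
        .sType             = VK_STRUCTURE_TYPE_PIPELINE_DYNAMIC_STATE_CREATE_INFO,
        .dynamicStateCount = 1,
        .pDynamicStates    = dynamic_states,
    };

    // At record time: flip the state directly in the command buffer, which is the
    // closest Vulkan gets to the "hardware toggle" a native GL/D3D9 driver could use.
    // (The extension entry point has to be fetched with vkGetDeviceProcAddr.)
    static void set_backface_culling(VkCommandBuffer cmd,
                                     PFN_vkCmdSetCullModeEXT pfnCmdSetCullMode)
    {
        pfnCmdSetCullMode(cmd, VK_CULL_MODE_BACK_BIT);
    }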
Also, any abstraction layer will add CPU overhead because you need to manage/allocate more objects, process state through more layers, etc. It doesn't matter in every piece of code, but when processing draw calls (which can happen millions of times a second) it adds up.
> Gallium3D
Is a lower layer below OpenGL. It was originally intended to be at a similar level to D3D9, so OpenGL drivers could avoid a lot of state tracking (e.g. texture dimensions are immutable upon creation). It's still higher-level than Vulkan, since memory management and synchronization are implicit (at least last I checked).
Hm. If Gallium3D is in some way higher level than Vulkan, but it also allows higher performance high-level API implementations (as I believe it does), maybe that is because it's internal to Mesa so drivers can punch holes in its abstractions as needed?
> The third option, surprisingly, is Gallium3D (or simply "Gallium"), he said. He has been learning that it is basically a hybrid approach between state streaming and pre-baked pipelines. It uses "constant state objects" (CSOs), which are immutable objects that capture a part of the GPU state and can be cached across multiple draw operations. CSOs are essentially a Vulkan pipeline that has been chopped up into pieces that can be mixed and matched as needed. Things like the blending state, rasterization mode, viewport, and shader would each have their own CSO. The driver can associate the actual GPU commands needed to achieve that state with the CSO.
It is probably because Gallium3D was designed specifically to implement higher level APIs like OpenGL and D3D, so they made sure that there was an efficient mapping.
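A toy illustration of the CSO idea from the quote (hypothetical types and names, not the actual Gallium/Mesa API): each small chunk of state is baked into hardware commands once when the CSO is created, and later "state changes" are just rebinding cached objects.

    /* Toy sketch of the CSO pattern - names are made up, not real Gallium API. */
    #include <stdint.h>

    /* One immutable chunk of state, translated into hardware packets at creation. */
    typedef struct {
        uint32_t hw_packet[4];   /* pre-baked register writes for, say, the blend unit */
    } blend_cso;

    typedef struct {
        const blend_cso *blend;  /* currently bound chunk; rasterizer, viewport,
                                    shader, etc. would each have their own CSO */
    } draw_state;

    /* "Changing state" is a pointer swap plus replaying cached packets - nothing is
     * recompiled, unlike a monolithic Vulkan pipeline that bakes all of this at once. */
    static void bind_blend(draw_state *st, const blend_cso *cso)
    {
        st->blend = cso;
    }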
Vulkan is growing extensions specifically for that though, so in the medium term it might catch up.
One of the reasons graphics drivers are huge on Windows is that, since there was no way to optimize precisely with old APIs like DX9 and GL, drivers would often "recognize" the game engine running and switch to a fast path dedicated to it. That's why there were updates so often and why you would sometimes even see "improved performance for game X" in the changelog. I guess that's exactly what is happening here: DXVK is the "generic" driver, while their own implementation is the fast path for specific games.
I wonder if this tradition of game-specific driver patches has even changed with the modern APIs though. If yes we would eventually see less frequent driver updates (which doesn't seem to be the case so far).
My understanding is yes and no. In the pre-DX12 era, drivers often contained full replacement shaders for games, where the OEM would hand-write a replacement that worked better than what the game dev did, potentially per GPU SKU. They would also sometimes completely rewrite how sections of the pipeline worked with shims, for the same reasons. This was sustainable because shaders were relatively tiny (look at the limitations of DX9 shaders vs DX10+) and the pipeline relatively simple. Moving to DX10+, shader size explodes, as do pipeline complexity and shader count. While OEMs still did this for popular titles by special agreement, they largely switched to issuing guidance and providing tools so devs could optimize their own games. But the issue persists: writing shaders for Nvidia's warp model will potentially compromise performance on AMD, Intel, etc., so OEMs still ship some shims for games whose performance is particularly egregious. With DX12 and Vulkan in particular, though, it's more about presets for things like you'd find in the Nvidia control panel plus a few game-specific optimizations to critical paths, instead of wholesale replacing how the game interacts with the driver.
I think we're seeing something related instead. AMD/nVidia have 20+ years of optimizing for specific games in their drivers. Intel doesn't. So they're using DXVK to make up for all the games they don't have the time/resources to optimize for. If this works out for Intel long-term, it would be a huge boon for all of us. It means any future competitors would be able to use DXVK to make up for the disadvantage of joining the game so late.
I think it's just bugfixes now. Instead of updating drivers, nVidia and AMD now work directly with developers to integrate the optimizations into their engines.
I think it has to be stressed that Vulkan and Direct3D 12 are still abstractions, and still do not provide direct hardware access. In some cases, their more explicit nature allows you to implement older APIs on top of them without any performance issues, but there are places where the abstractions make incompatible assumptions and it is very hard to get as good performance as a native driver can.
Does this include a miniport driver for regular GDI acceleration? I recently switched from a low-end Radeon to an Arc A770, and older software like Beyond Compare and VS2015 became noticeably choppy.
Does Windows 10 still use the GDI interface (AFAIK some fancy bitblt stuff from the olden S3 days)? Surely that is deprecated by now and everything is running on top of the GPU?
Even GDI runs on the GPU[1]; it's just slower because bitmaps are stored in system RAM rather than VRAM. IIRC there are tricks a driver can use to make it faster by default[2].
The (admittedly very dumb) issue I have as a developer switching from MFC/ATL to newer kits is that it becomes very hard to ship all-in-one apps without bloating them by bundling the control toolkit every single time. I realize this is a dumb complaint, but I can get something down to ~3.5MB, which I consider a win for size-squishing in modern apps. Does it make sense for all cases? No; if I were writing something people need to use on more than an occasional basis, I'd probably grab a toolkit that did the work for me to use D2D (which, by the way, operates on top of either GDI or DX11, not DX12).
How is the job market for MFC/ATL? Mostly maintenance/legacy work?
My first "hello world" Windows app was built with MFC, and although I hate it, my heart has a soft spot for it because of that.