
As usual, no AMD GPU support mentioned. What a sad state of affairs; I regret going with AMD this time.



AMD GPU support is definitely an important part of the project roadmap (sorry this isn't better published in a ROADMAP.md or similar for the project – will do that soon).

A few of the maintainers of the project are from the Toronto area, the original home of ATI technologies [1], and so we personally want to see Ollama work well on AMD GPUs :).

One of the test machines we use to work on AMD support for Ollama is running a Radeon RX 7900XT, and it's quite fast. Definitely comparable to a high-end GeForce 40 series GPU.

[1]: https://en.wikipedia.org/wiki/ATI_Technologies


What about AMD APUs with RDNA graphics? Any chance of getting Ollama working on them?



Same. I really want AMD to succeed because, as a long-time Linux user, I have a strong distaste for Nvidia and the hell they've put me through. I paid a lot for a beastly AMD card in the hope that it would be only shortly behind Nvidia, and that has most definitely not been the case. I blame AMD for not putting the resources behind it.

AMD, you can change, but you need to start NOW.


Hi, we’ve been working to support AMD GPUs directly via ROCm. It’s still under development but if you build from source it does work:

https://github.com/ollama/ollama/blob/main/docs/development....
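
Roughly, the source build looks like this (a sketch, assuming the standard Go flow from the linked doc and a working ROCm install; the doc is authoritative):

    # prerequisites (assumed): Go, cmake, and ROCm already installed
    git clone https://github.com/ollama/ollama
    cd ollama
    go generate ./...   # builds the bundled llama.cpp, picking up ROCm if present
    go build .
    ./ollama serve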


Every time I try to run anything through ROCm, my machine kernel-panics.

I’m not blaming you for this, but I’m also sticking with nvidia.


Really sorry about this. Do you happen to have logs for us to look into? This is definitely not the experience we want.
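
If it helps, on Linux the server log should be visible via systemd, and a panic trace from the previous boot via the kernel journal (assuming the standard service install):

    journalctl -u ollama    # Ollama server logs
    journalctl -k -b -1     # kernel messages from the boot that panicked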


To be clearer, it isn't Ollama-specific. I first encountered the issue with Stable Diffusion, and it's remained since, but the GPU that causes it isn't currently inside any machine; I replaced it with a 3090 a few days ago.


I'd recommend trying stuff that exhausts the VRAM. That seems to be where things get flaky for me (RX 7600, 8 GB), especially if running a desktop too.


And you're the lucky one getting the chance to kernel panic with ROCm. AMD drops ROCm support for their consumer GPUs so fast it'll make your head spin. I bought my GPU for $230 in 2020 and by 2021 AMD had dropped support for it, just a bit under 4 years after the card's market release.


Working well for me on a 7900XT with ROCm 6 and Linux 6.7.5, thanks!


What is the speedup vs. CPU?


Curious how differently a long-time FreeBSD user feels. I have a strong distaste for anything not Nvidia.

Official Nvidia drivers were added to the FreeBSD repository 21 years ago. I can't count the number of different drivers used for ATi/AMD in those two decades, and none of them had the performance or stability.


Ollama is a model-management app that runs on top of llama.cpp, so you should ask there about AMD support.


I've been running llama.cpp with full GPU acceleration on my AMD card, using the text-generation-webui install script on Kubuntu. Same with Stable Diffusion using A1111. AMD's compute stack is indeed quite broken and more fragile, but it does work with most modern cards.

The kernel panics though... yeah, I had those on my Radeon VII before I upgraded.


llama.cpp has had ROCm support for a long time


What problems have you had with AMD and in what fashion do they fall short of Nvidia?


I've had no end of difficulty installing the Pro drivers and/or ROCm. The "solution" that was recommended was to install a different distro (I use Fedora and installing CentOS or Ubuntu was recommended). When I finally could get it installed, I got kernel panics and my system frequently became unbootable. Then once it was installed, getting user space programs to recognize it was the next major pain point.


I've been using Nvidia and it stopped being challenging in about 2006. I hear perpetually that Nvidia is horrible and I should try AMD. The two times I did (admittedly a long time ago), it was... not great.


Do you use Ubuntu LTS? If so, then indeed Nvidia is not a problem.

But if you run a distro that ships anything near the newest kernels, such as Fedora or Arch, you'll constantly be in fear of receiving kernel updates. Every so often the packages will be broken and you'll have to use Nvidia's horrible installer. And every once in a while they'll quietly drop support for older cards and you'll need to move to the legacy package, but the way you'll find out is that your system suddenly doesn't boot, you happen to think of the old Nvidia card, you Kagi that, and you discover the change.


I found it much easier to make ROCm/AMD work for AI (including on a laptop) than to get Nvidia working with Xorg on an Optimus laptop with an Intel iGPU/Nvidia dGPU. I swore off Nvidia at that point.


Changing kernels automatically as new releases come out was never an optimal strategy, even if it's what you get by default in Arch. Notably, Arch's linux-lts is presently at 6.6 whereas mainline is 6.7.

Instead of treating it like a dice roll and living in existential dread of the entirely predictable peril of Linus cutting releases that occasionally front-run NVIDIA (which releases less frequently), I simply don't install kernels first released yesterday or pull in major kernel version updates the day they land. I don't remove the old kernel automatically when the new one is installed, and I automatically make snapshots on update to guard against any issue that might come up.

If that seems like too much work, one could at least keep the prior kernel version around; if an update breaks, you reboot into it and you're only out 45 seconds of your life. This actually seems like a good idea no matter what.

I don't think I have used Nvidia's installer since 2003, on Fedora Core One (as the nomenclature used to be). One simply doesn't need to. Also, generally speaking, one doesn't need a legacy package until a card is over 10 years old. For instance, the newest consumer cards to lose support are the 600 series from 2012.

If you still own a 2012 GPU you should probably put it where it belongs: in the trash. But when you get to the sort of computers that require legacy support (roughly 2009-2012), you're apt to be worrying about other matters, like distros that still support 32-bit, simple environments like Xfce, and software that works well in RAM-constrained environments. Needing to install a slightly different driver seems tractable by comparison.
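
For concreteness, a sketch of the fallback-kernel approach on Arch (standard repo packages; the snapshot tooling is one choice among several and assumes btrfs):

    # keep the LTS kernel installed alongside mainline as a boot fallback
    pacman -S linux-lts linux-lts-headers
    # snapshot the root filesystem around every pacman transaction
    pacman -S snapper snap-pac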


Try the runfile provided by Nvidia, and use DKMS. The biggest issue is just that Flatpaks aren't really kept up to date for CUDA drivers, but you can simply not use them if your distro isn't old or niche.


On Fedora 40, I believe you can install llama.cpp's ROCm dependencies with:

    dnf install hipcc rocm-hip-devel rocblas-devel hipblas-devel


So, after a bit of experimentation, it seems that Fedora is built primarily for RDNA 3 while Debian is built for RDNA 2 and earlier. These are llama-cpp build instructions for Fedora: https://gist.github.com/cgmb/bb661fccaf041d3649f9a90560826eb.... These are llama-cpp build instructions for Debian: https://gist.github.com/cgmb/be113c04cd740425f637aa33c3e4ea3....
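
The core of those gists boils down to something like this (a sketch using the LLAMA_HIPBLAS flag llama-cpp used at the time; the gists carry the distro-specific details, and the GPU target below is just an example):

    # assumes the ROCm/HIP devel packages above are installed
    git clone https://github.com/ggerganov/llama.cpp
    cd llama.cpp
    # set the target for your card, e.g. gfx1100 for RDNA 3
    cmake -B build -DLLAMA_HIPBLAS=ON -DAMDGPU_TARGETS=gfx1100
    cmake --build build -j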


Great, thanks I will give this a try once I upgrade!


What hell specifically, do you mean loading binary blob drivers in the past?


AMD clearly believes that this newfangled "GPU compute" fad will pass soon, so there's no point to invest in it.

This is one of the worst acts of self-sabotage I have ever seen in the tech business.


Zen4 AVX512 must be really good then.


To be fair, a lot of the GPU edge comes from fast memory. A GPU with 20 Tflops running a 30-billion-parameter model has a compute budget of roughly 700 flops per parameter per second. Meanwhile, the sheer size of the model prevents you from loading it from memory more than about 20 times per second.
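
Spelled out, assuming fp16 weights and ~1.2 TB/s of memory bandwidth (numbers not stated above, but consistent with the ~20 loads per second figure):

    # compute budget: 20e12 flops/s / 30e9 params ≈ 667 flops per parameter per second
    # model size:     30e9 params x 2 bytes (fp16) = 60 GB
    # memory budget:  1.2e12 B/s / 60e9 B = 20 full passes over the weights per second
    # at ~2 flops per parameter per generated token, decoding is bandwidth-bound near 20 tokens/s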


llamafile has AMD GPU support. On Windows, it only depends on the graphics driver, thanks to our tinyBLAS library.

https://github.com/Mozilla-Ocho/llamafile/releases/tag/0.6.2

By default it opens a browser tab with a chat GUI. You can run it as a CLI chatbot like Ollama as follows:

https://justine.lol/oneliners/#chat
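
The gist of those one-liners is something like this (a sketch; the flags are standard llama.cpp options that llamafile inherits, and the model file name is just an example; see the link for the exact invocation):

    # download a llamafile, mark it executable, and prompt it from the terminal
    chmod +x mistral-7b-instruct-v0.2.Q4_0.llamafile
    ./mistral-7b-instruct-v0.2.Q4_0.llamafile \
        --temp 0.7 -ngl 9999 \
        -p '[INST]Why is the sky blue?[/INST]'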


Which AMD CPUs are supported by tinyBLAS/llamafile?


This is about GPU


As others have mentioned, Ollama uses llama.cpp under the hood, which recently released Vulkan support that is supposed to work with AMD GPUs. I was able to use llama.cpp compiled with Vulkan support in my app [1] and make it run on an AMD laptop, but I was unable to make it work with Ollama, as Ollama makes some assumptions about how it goes about searching for available GPUs on a machine.

[1]: https://msty.app
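
For reference, a Vulkan build of llama.cpp looked roughly like this at the time (a sketch; the Vulkan headers, loader, and shader compiler need to be installed first):

    # assumes Vulkan dev packages are present (e.g. libvulkan-dev, glslc)
    cmake -B build -DLLAMA_VULKAN=ON
    cmake --build build -j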


I got a Windows Defender virus alert after executing your app.


Ugh! Probably because it's an exe app? Not sure how to get around that. I am looking into getting it signed, just like its macOS counterpart. Thank you for the heads up, and sorry about the false positive.


Ironically Ollama is also struggling with this sort of thing, see https://github.com/ollama/ollama/issues/2519

Code signing helps by giving you an avenue to establish a reliable reputation. The second part is using VirusTotal to check for AV flags and filing requests through each AV vendor's whitelist form; over time your reputation increases and you stop getting flagged as malware.

It seems to be much more likely with AI stuff, apparently due to use of CUDA or something (/shrug)


ROCm is preferred over Vulkan for AMD GPUs, performance-wise. OpenCL or Vulkan should only be used for older cards or weird setups.


That’s good to know. Thank you!


Ollama has an OpenCL backend. I'm on Linux and CLBlast works great with AMD cards. As far as I remember, OpenCL on Windows didn't have that many issues, but it's been a while.
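
For anyone trying that route with plain llama.cpp, the CLBlast build was a one-liner (a sketch; requires the OpenCL headers and the CLBlast dev package):

    # from a llama.cpp checkout, with the OpenCL ICD loader + CLBlast installed
    make LLAMA_CLBLAST=1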


Maybe there’s proper support soon in AI landscape [0].

[0]: https://news.ycombinator.com/item?id=39344815


I've had success using my AMD GPU with the OpenCL backend for llama.cpp. The ROCm backend had pretty bad performance, though.


Can't edit the parent post; here's the issue to track Windows ROCm support in Ollama: https://github.com/ollama/ollama/issues/2598


AMD is the underdog, and that's what happens when you choose the underdog.


I would argue we are well past the point of calling AMD an underdog.



