
As usual, no AMD GPU support mentioned. What a sad state of affairs; I regret going with AMD this time.



AMD GPU support is definitely an important part of the project roadmap (sorry this isn't better published in a ROADMAP.md or similar for the project – will do that soon).

A few of the maintainers of the project are from the Toronto area, the original home of ATI technologies [1], and so we personally want to see Ollama work well on AMD GPUs :).

One of the test machines we use to work on AMD support for Ollama is running a Radeon RX 7900XT, and it's quite fast. Definitely comparable to a high-end GeForce 40 series GPU.

[1]: https://en.wikipedia.org/wiki/ATI_Technologies


What about AMD APUs with RDNA graphics? Any chance of getting Ollama working on them?



Same. I really want AMD to succeed because, as a long-time Linux user, I have a strong distaste for Nvidia and the hell they've put me through. I paid a lot for a beastly AMD card in the hope that it would be only shortly behind Nvidia, and that has most definitely not been the case. I blame AMD for not putting the resources behind it.

AMD, you can change, but you need to start NOW.


Hi, we’ve been working to support AMD GPUs directly via ROCm. It’s still under development but if you build from source it does work:

https://github.com/ollama/ollama/blob/main/docs/development....
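
Roughly, the source build looks like this (a sketch, assuming the standard Go flow from the linked doc and a working ROCm install; the doc is authoritative):

    # prerequisites (assumed): Go, cmake, and ROCm already installed
    git clone https://github.com/ollama/ollama
    cd ollama
    go generate ./...   # builds the bundled llama.cpp, picking up ROCm if present
    go build .
    ./ollama serve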


Every time I try to run anything through ROCm, my machine kernel-panics.

I’m not blaming you for this, but I’m also sticking with nvidia.


Really sorry about this. Do you happen to have logs for us to look into? This is definitely not the experience we want.
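
If it helps, on Linux the server log should be visible via systemd, and a panic trace from the previous boot via the kernel journal (assuming the standard service install):

    journalctl -u ollama    # Ollama server logs
    journalctl -k -b -1     # kernel messages from the boot that panicked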


To be clearer, it isn't Ollama-specific. I first encountered the issue with Stable Diffusion, and it's remained since, but the GPU that causes it isn't currently inside any machine; I replaced it with a 3090 a few days ago.


I'd recommend trying stuff that exhausts the VRAM. That seems to be where things get flaky for me (RX 7600, 8 GB), especially if running a desktop too.


And you're the lucky one getting the chance to kernel panic with ROCm. AMD drops ROCm support for their consumer GPUs so fast it'll make your head spin. I bought my GPU for $230 in 2020 and by 2021 AMD had dropped support for it, just a bit under 4 years after the card's market release.


Working well for me on a 7900XT with ROCm 6 and Linux 6.7.5, thanks!


What is the speedup vs. CPU?


Curious how differently a long-time FreeBSD user feels. I have a strong distaste for anything not Nvidia.

Official Nvidia drivers were added to the FreeBSD repository 21 years ago. I can't count the number of different drivers used for ATi/AMD in those two decades, and none of them had the performance or stability.


Ollama is a model-management app that runs on top of llama.cpp, so you should ask there about AMD support.


I've been running llama.cpp with full GPU acceleration on my AMD card, using the text-generation-webui install script on Kubuntu. Same with Stable Diffusion using A1111. AMD's compute stack is indeed quite broken and more fragile, but it does work with most modern cards.

The kernel panics though... yeah, I had those on my Radeon VII before I upgraded.


llama.cpp has had ROCm support for a long time


What problems have you had with AMD and in what fashion do they fall short of Nvidia?


I've had no end of difficulty installing the Pro drivers and/or ROCm. The "solution" that was recommended was to install a different distro (I use Fedora and installing CentOS or Ubuntu was recommended). When I finally could get it installed, I got kernel panics and my system frequently became unbootable. Then once it was installed, getting user space programs to recognize it was the next major pain point.


I've been using Nvidia and it stopped being challenging in about 2006. I hear perpetually that Nvidia is horrible and I should try AMD. The two times I did (admittedly a long time ago), it was... not great.


Do you use Ubuntu LTS? If so, then indeed Nvidia is not a problem.

But if you run a distro that ships anything near the newest kernels, such as Fedora or Arch, you'll constantly be in fear of receiving kernel updates. Every so often the packages will be broken and you'll have to use Nvidia's horrible installer. And every once in a while they'll quietly drop support for older cards and you'll need to move to the legacy package, but the way you'll find out is that your system suddenly doesn't boot, you happen to think of the old Nvidia card, you Kagi that, and you discover the change.


I found it much easier to make ROCm/AMD work for AI (including on a laptop) than to get Nvidia working with Xorg on an Optimus laptop with an Intel iGPU/Nvidia dGPU. I swore off Nvidia at that point.


Changing kernels automatically as new releases come out was never an optimal strategy, even if it's what you get by default in Arch. Notably, Arch's linux-lts is presently at 6.6 whereas mainline is 6.7.

Instead of treating it like a dice roll and living in existential dread of the entirely predictable peril of Linus cutting releases that occasionally front-run NVIDIA (which releases less frequently), I simply don't install kernels first released yesterday or pull in major kernel version updates the day they land. I don't remove the old kernel automatically when the new one is installed, and I automatically make snapshots on update to guard against any issue that might come up.

If that seems like too much work, one could at least keep the prior kernel version around; if an update breaks, you reboot into it and you're only out 45 seconds of your life. This actually seems like a good idea no matter what.

I don't think I have used Nvidia's installer since 2003, on Fedora Core One (as the nomenclature used to be). One simply doesn't need to. Also, generally speaking, one doesn't need a legacy package until a card is over 10 years old. For instance, the newest consumer cards to lose support are the 600 series from 2012.

If you still own a 2012 GPU you should probably put it where it belongs: in the trash. But when you get to the sort of computers that require legacy support (roughly 2009-2012), you're apt to be worrying about other matters, like distros that still support 32-bit, simple environments like Xfce, and software that works well in RAM-constrained environments. Needing to install a slightly different driver seems tractable by comparison.
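
For concreteness, a sketch of the fallback-kernel approach on Arch (standard repo packages; the snapshot tooling is one choice among several and assumes btrfs):

    # keep the LTS kernel installed alongside mainline as a boot fallback
    pacman -S linux-lts linux-lts-headers
    # snapshot the root filesystem around every pacman transaction
    pacman -S snapper snap-pac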


Try the runfile provided by Nvidia, and use DKMS. The biggest issue is just that Flatpaks aren't really kept up to date for CUDA drivers, but you can simply not use them if your distro isn't old or niche.


On Fedora 40, I believe you can install llama.cpp's ROCm dependencies with:

    dnf install hipcc rocm-hip-devel rocblas-devel hipblas-devel


So, after a bit of experimentation, it seems that Fedora is built primarily for RDNA 3 while Debian is built for RDNA 2 and earlier. These are llama-cpp build instructions for Fedora: https://gist.github.com/cgmb/bb661fccaf041d3649f9a90560826eb.... These are llama-cpp build instructions for Debian: https://gist.github.com/cgmb/be113c04cd740425f637aa33c3e4ea3....
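
The core of those gists boils down to something like this (a sketch using the LLAMA_HIPBLAS flag llama-cpp used at the time; the gists carry the distro-specific details, and the GPU target below is just an example):

    # assumes the ROCm/HIP devel packages above are installed
    git clone https://github.com/ggerganov/llama.cpp
    cd llama.cpp
    # set the target for your card, e.g. gfx1100 for RDNA 3
    cmake -B build -DLLAMA_HIPBLAS=ON -DAMDGPU_TARGETS=gfx1100
    cmake --build build -j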


Great, thanks I will give this a try once I upgrade!


What hell specifically, do you mean loading binary blob drivers in the past?


AMD clearly believes that this newfangled "GPU compute" fad will pass soon, so there's no point to invest in it.

This is one of the worst acts of self-sabotage I have ever seen in the tech business.


Zen4 AVX512 must be really good then.


To be fair, a lot of the GPU edge comes from fast memory. A GPU with 20 Tflops running a 30-billion-parameter model has a compute budget of roughly 700 flops per parameter per second. Meanwhile, the sheer size of the model prevents you from loading it from memory more than about 20 times per second.
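
Spelled out, assuming fp16 weights and ~1.2 TB/s of memory bandwidth (numbers not stated above, but consistent with the ~20 loads per second figure):

    # compute budget: 20e12 flops/s / 30e9 params ≈ 667 flops per parameter per second
    # model size:     30e9 params x 2 bytes (fp16) = 60 GB
    # memory budget:  1.2e12 B/s / 60e9 B = 20 full passes over the weights per second
    # at ~2 flops per parameter per generated token, decoding is bandwidth-bound near 20 tokens/s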


llamafile has AMD GPU support. On Windows, it only depends on the graphics driver, thanks to our tinyBLAS library.

https://github.com/Mozilla-Ocho/llamafile/releases/tag/0.6.2

By default it opens a browser tab with a chat GUI. You can run it as a CLI chatbot like Ollama as follows:

https://justine.lol/oneliners/#chat
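
The gist of those one-liners is something like this (a sketch; the flags are standard llama.cpp options that llamafile inherits, and the model file name is just an example; see the link for the exact invocation):

    # download a llamafile, mark it executable, and prompt it from the terminal
    chmod +x mistral-7b-instruct-v0.2.Q4_0.llamafile
    ./mistral-7b-instruct-v0.2.Q4_0.llamafile \
        --temp 0.7 -ngl 9999 \
        -p '[INST]Why is the sky blue?[/INST]'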


Which AMD CPUs are supported by tinyBLAS/llamafile?


This is about GPU


As others have mentioned, Ollama uses llama.cpp under the hood, which recently released Vulkan support that is supposed to work with AMD GPUs. I was able to use llama.cpp compiled with Vulkan support in my app [1] and make it run on an AMD laptop, but I was unable to make it work with Ollama, as Ollama makes some assumptions about how it goes about searching for available GPUs on a machine.

[1]: https://msty.app
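
For reference, a Vulkan build of llama.cpp looked roughly like this at the time (a sketch; the Vulkan headers, loader, and shader compiler need to be installed first):

    # assumes Vulkan dev packages are present (e.g. libvulkan-dev, glslc)
    cmake -B build -DLLAMA_VULKAN=ON
    cmake --build build -j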


I got a Windows Defender virus alert after executing your app.


Ugh! Probably because it's an exe app? Not sure how to get around that. I am looking into getting it signed, just like its macOS counterpart. Thank you for the heads up, and sorry about the false positive.


Ironically Ollama is also struggling with this sort of thing, see https://github.com/ollama/ollama/issues/2519

Code signing helps by giving you an avenue to establish a reliable reputation. The second part is using VirusTotal to check for AV flags and filing requests through each AV vendor's whitelist form; over time your reputation increases and you stop getting flagged as malware.

It seems to be much more likely with AI stuff, apparently due to use of CUDA or something (/shrug)


ROCm is preferred over Vulkan for AMD GPUs, performance-wise. OpenCL or Vulkan should only be used for older cards or weird setups.


That’s good to know. Thank you!


Ollama has an OpenCL backend. I'm on Linux and CLBlast works great with AMD cards. As far as I remember, OpenCL on Windows didn't have that many issues, but it's been a while.
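
For anyone trying that route with plain llama.cpp, the CLBlast build was a one-liner (a sketch; requires the OpenCL headers and the CLBlast dev package):

    # from a llama.cpp checkout, with the OpenCL ICD loader + CLBlast installed
    make LLAMA_CLBLAST=1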


Maybe there’s proper support soon in AI landscape [0].

[0]: https://news.ycombinator.com/item?id=39344815


I've had success using my AMD GPU with the OpenCL backend for llama.cpp. The ROCm backend had pretty bad performance, though.


Can't edit the parent post; here's the issue to track Windows ROCm support in Ollama: https://github.com/ollama/ollama/issues/2598


AMD is the underdog, and that's what happens when you choose the underdog.


I would argue we are well past the point of calling AMD an underdog.



