I think they've been taken over by exactly the same people leading the AI hype. Funny how in this article they are a) not clearly advertising what they are doing, b) solving a small subset of problems in a way no one asked for (I think most people just want ROCm to work at all...), and c) just adding to a complex product without any consideration of actually integrating with its environment.
> solving a small subset of problems in a way no one asked for
What do you mean? Having fused MoE and MLA kernels for ROCm as counterparts to the CUDA kernels is very useful. AMD needs to provide these if it wants to keep its accelerators competitive with new models.
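For a sense of what those kernels replace, here is a rough sketch (names and shapes are illustrative, not AMD's actual API) of an unfused top-k MoE forward in plain PyTorch. A fused kernel collapses the per-expert loop below into far fewer launches:

```python
import torch
import torch.nn.functional as F

def moe_forward_unfused(x, gate_w, expert_w1, expert_w2, top_k=2):
    """Naive top-k MoE forward, one pass per expert.
    x: (tokens, hidden), gate_w: (hidden, n_experts),
    expert_w1: (n_experts, hidden, ffn), expert_w2: (n_experts, ffn, hidden).
    A fused kernel replaces this loop with grouped GEMMs."""
    n_experts = gate_w.shape[1]
    scores = F.softmax(x @ gate_w, dim=-1)           # (tokens, n_experts)
    weights, experts = scores.topk(top_k, dim=-1)    # routing decision
    out = torch.zeros_like(x)
    for e in range(n_experts):                       # kernel launches per expert
        mask = (experts == e)                        # (tokens, top_k)
        token_idx, slot_idx = mask.nonzero(as_tuple=True)
        if token_idx.numel() == 0:
            continue
        h = F.silu(x[token_idx] @ expert_w1[e]) @ expert_w2[e]
        out.index_add_(0, token_idx, h * weights[token_idx, slot_idx, None])
    return out
```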
Should the matrix multiplication at the core of this not live in a core library? Why are generic layers intermixed with LLM-specific kernels when those generic layers duplicate functionality already in torch?
Upstreaming that might actually help researchers doing new work, versus the narrow demographic of people speeding up LLMs on MI300Xs. To make the duplication concrete, see the sketch below.
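A minimal sketch: the generic pieces are one-liners in torch, which already dispatches to the vendor BLAS underneath (cuBLAS on CUDA builds, rocBLAS/hipBLASLt on ROCm builds), so reimplementing them alongside the LLM-specific kernels buys nothing:

```python
import torch

# The "generic layers" in question are already core torch ops; on GPU
# builds they dispatch to the platform BLAS, so only the LLM-specific
# fusions need bespoke kernels.
dev = "cuda" if torch.cuda.is_available() else "cpu"
linear = torch.nn.Linear(4096, 4096, device=dev)
x = torch.randn(8, 4096, device=dev)
y = linear(x)             # GEMM via the vendor BLAS
z = torch.matmul(x, x.T)  # likewise; no custom kernel needed
```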
> I think most people just want ROCm to work at all
I think most people don't want to have to think about vendor-lock-in-related bullshit. Most people just want their model to run on whatever hardware they happen to have available, don't want to worry about whether future hardware purchases will be compatible, and don't want to rewrite everything in a different framework.
Most people fundamentally don't care about ROCm or CUDA or OneAPI or whatever else, except as a means to an end.
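At the framework level that portability largely exists already: ROCm builds of PyTorch expose AMD GPUs through the torch.cuda API, so device-agnostic code in this style runs unchanged on either vendor (a minimal sketch of the pattern):

```python
import torch

# ROCm builds of PyTorch surface HIP devices through the torch.cuda
# namespace, so this selection works on NVIDIA and AMD alike.
device = "cuda" if torch.cuda.is_available() else "cpu"

model = torch.nn.Sequential(
    torch.nn.Linear(512, 512),
    torch.nn.ReLU(),
    torch.nn.Linear(512, 10),
).to(device)

x = torch.randn(32, 512, device=device)
logits = model(x)  # same code path whether the backend is CUDA or ROCm
```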
I guess it's vibecoding "AI"...