Hacker News

> Please tell me your config! I have an i9-10900 with 32 GB of RAM that only gets 0.7 tokens/s on a 30B model

Have you quantized it?




The model I have is q4_0; I think that's 4-bit quantized.

I'm running on Windows using koboldcpp; maybe it's faster on Linux?


I am running Linux with cublast offload, and I am using the new 3-bit quant that was pulled in a day or two ago.


Thanks! I'll have to try the 3-bit quant to see if that helps.


cuBLAS or CLBlast? There is no such thing as "cublast".
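For anyone confused by the distinction: koboldcpp exposes the two backends as separate launch flags. A minimal sketch, assuming a build from around this time; the paths and layer count are placeholders, so verify the exact flags against `python koboldcpp.py --help` on your version:

```shell
# cuBLAS: NVIDIA-only GPU offload via CUDA
python koboldcpp.py model-q4_0.bin --usecublas --gpulayers 24

# CLBlast: OpenCL-based, works on AMD/Intel/NVIDIA GPUs;
# the two numbers pick the OpenCL platform and device
python koboldcpp.py model-q4_0.bin --useclblast 0 0 --gpulayers 24
```

On a 10900 (no discrete GPU mentioned), `--gpulayers` only helps if there is actually a GPU to offload to; otherwise the backend choice matters much less than the quant size.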


> The model I have is q4_0; I think that's 4-bit quantized.

That's correct, yeah. Q4_0 should be the smallest and fastest quantized model.

> I'm running on Windows using koboldcpp; maybe it's faster on Linux?

Possibly. You could try using WSL to test; I think both WSL1 and WSL2 are faster than native Windows (but WSL1 should be faster than WSL2).
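If you want to produce a q4_0 file yourself rather than download one, llama.cpp ships a quantize tool for this. A rough sketch, with placeholder paths, assuming a build from around this time (older builds took a numeric type code instead of the name `q4_0`):

```shell
# Convert an f16 GGML model down to 4-bit q4_0
# (the smallest/fastest of the classic quant formats).
./quantize models/30B/ggml-model-f16.bin models/30B/ggml-model-q4_0.bin q4_0
```

The newer 3-bit quants mentioned upthread are selected the same way, just with a different type name in the last argument.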


I didn't know what WSL was, but now I do, thanks for the tip!
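For others who haven't used it: on recent Windows 10/11 builds, WSL can be installed with a single command from an elevated PowerShell or Command Prompt (the distribution name here is just an example):

```shell
# Installs WSL plus the named Linux distribution; reboot when prompted.
wsl --install -d Ubuntu
```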



