Hacker News
abawany | 5 months ago | on: Gemma 3 QAT Models: Bringing AI to Consumer GPUs
I tried the 27b-iat model on a 4090m with 16 GB of VRAM, with mostly default args, via llama.cpp, and it didn't fit: it used up the VRAM and spilled about 2 GB into system RAM. Performance in this setup was < 5 tps.
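A rough back-of-envelope sketch of why a ~27B-parameter 4-bit model is tight in 16 GB of VRAM. The figures here (~4.5 effective bits/weight for a Q4_0-style quant including scales/metadata, and a modest fixed overhead for KV cache and compute buffers) are assumptions for illustration, not measured numbers:

```python
# Back-of-envelope VRAM estimate (all figures are assumptions, not measurements).

params = 27e9              # approximate parameter count (assumed)
bits_per_weight = 4.5      # effective bits/weight for a Q4_0-style quant,
                           # including scales/metadata (rough assumption)

# Quantized weights alone, in GB (decimal).
weights_gb = params * bits_per_weight / 8 / 1e9

# KV cache + compute buffers at a modest context length
# (rough assumption; this grows with context).
overhead_gb = 1.5

total_gb = weights_gb + overhead_gb
print(f"weights ~{weights_gb:.1f} GB, total ~{total_gb:.1f} GB")
```

Under these assumptions the weights alone land around 15 GB, so with any KV cache the total exceeds a 16 GB card and llama.cpp spills the remainder into system RAM, which matches the slowdown described above.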