I'd recommend trying the Llama-based distill (same size, same quantization), which you can find here: https://huggingface.co/bartowski/DeepSeek-R1-Distill-Llama-8...
It should use the same amount of memory as the one you're currently running.
In my experience the Llama version is much better at adhering to the prompt, understanding data in multiple languages, and going in depth in its responses.