| Tabby only supports the use of a single GPU. To utilize multiple GPUs, you can initiate multiple Tabby instances and set CUDA_VISIBLE_DEVICES (for CUDA) or HIP_VISIBLE_DEVICES (for ROCm) accordingly.
So running inference across two NVLink-connected GPUs is not supported? Or is that situation different because NVLink presents the two GPUs as a single device?
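For reference, a minimal sketch of the workaround the quoted passage describes: one Tabby process per GPU, each pinned with CUDA_VISIBLE_DEVICES and listening on its own port. This assumes the `tabby` binary is on PATH and that `tabby serve` accepts `--device` and `--port` flags; the exact flags, model, and ports may differ by version and installation.

```python
# Hedged sketch: launch one Tabby instance per GPU by pinning each process
# to a single device via CUDA_VISIBLE_DEVICES (use HIP_VISIBLE_DEVICES for ROCm).
import os
import subprocess

instances = []
for gpu_id, port in [(0, 8080), (1, 8081)]:  # placeholder GPU ids and ports
    env = os.environ.copy()
    env["CUDA_VISIBLE_DEVICES"] = str(gpu_id)  # this instance sees only one GPU
    instances.append(
        subprocess.Popen(
            ["tabby", "serve", "--device", "cuda", "--port", str(port)],
            env=env,
        )
    )

# Keep both servers running until they exit.
for proc in instances:
    proc.wait()
```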
I see. So either I can have Tabby act as my LLM server with this limitation, or I can turn that feature off and point Tabby at my self-hosted LLM like any other OpenAI-compatible endpoint?
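For context, "OpenAI-compatible endpoint" here just means a server that exposes the OpenAI chat-completions API; Tabby's side of that is set in its config file. Below is a minimal, hedged sketch for sanity-checking that a self-hosted endpoint actually speaks that protocol before pointing Tabby at it. The base URL and model name are placeholders for your own deployment.

```python
# Hedged sketch: verify a self-hosted server answers OpenAI-style
# chat-completions requests. BASE_URL and the model name are placeholders.
import requests

BASE_URL = "http://localhost:8000/v1"  # your self-hosted, OpenAI-compatible server

resp = requests.post(
    f"{BASE_URL}/chat/completions",
    json={
        "model": "my-local-model",  # placeholder model name
        "messages": [{"role": "user", "content": "ping"}],
    },
    timeout=30,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```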