Thanks! Those are all good questions. Let me respond to them one by one.
> Can I just use arch for routing between LLMs
Yes, you can use the arch_config.yaml file to select between LLMs. In fact, we have a demo on llm_routing [1] that you can try. Here's how you can specify different LLMs in our config:
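A minimal sketch of what the provider section can look like; the exact field names, env-var style, and models below are illustrative, so treat the llm_routing demo [1] as the authoritative reference:

```yaml
# Illustrative arch_config.yaml fragment (sketch, not a spec; see [1]).
llm_providers:
  - name: gpt-4o                 # alias your clients route to
    provider: openai
    access_key: $OPENAI_API_KEY  # resolved by the gateway, never by clients
    model: gpt-4o
    default: true
  - name: ministral
    provider: mistral
    access_key: $MISTRAL_API_KEY
    model: ministral-3b-latest
```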
> And what LLMs do you support?

We currently support Mistral and OpenAI, and for both of them we support a streaming interface. We expose an OpenAI-compliant v1/chat interface, so any chat UI that works with OpenAI should work with us as well (see the client sketch after the next answer). We also ship demos with a Gradio sample application.
> And what about key management? Do I manage access keys myself?
None of your clients need to manage access keys. Upon receipt of a request, our filter selects the appropriate LLM from arch_config, picks the relevant access_key, and rewrites the request with that key before sending it to the upstream LLM [2].
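To make both answers concrete, here is a minimal client sketch against the OpenAI-compliant endpoint. The base_url (a local gateway on port 12000) and the model alias are assumptions for illustration, and the api_key is a placeholder, since the gateway injects the real provider key from arch_config:

```python
# Minimal client sketch (assumed gateway address/port; adjust to your deployment).
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:12000/v1",  # assumed local gateway endpoint
    api_key="not-needed",                  # placeholder; gateway injects the real key
)

# Streaming works the same way it does against OpenAI directly.
stream = client.chat.completions.create(
    model="gpt-4o",  # alias defined in arch_config (illustrative)
    messages=[{"role": "user", "content": "Hello!"}],
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
```

Because the interface mirrors OpenAI's, switching providers is just a matter of pointing the model alias at a different entry in arch_config.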
[1] https://github.com/katanemo/archgw/tree/main/demos/llm_routi...
[2] https://github.com/katanemo/archgw/blob/main/crates/llm_gate...