
OpenAI's multi-agent framework, Swarm, only supports models from OpenAI.

OpenSwarm uses LiteLLM to add support for any LLM provider: Anthropic, Mistral, Ollama, Hugging Face, Groq, Replicate.
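For anyone curious what that looks like in practice, here is a minimal sketch using litellm.completion; the model-name strings below are illustrative and may need to match your own accounts/deployments:

```python
from litellm import completion

messages = [{"role": "user", "content": "What's the weather like today?"}]

# Same call shape, different providers; LiteLLM routes based on the model string.
# Model names here are examples only.
for model in [
    "gpt-4o-mini",                               # OpenAI
    "anthropic/claude-3-haiku-20240307",         # Anthropic
    "ollama/llama3",                             # local Ollama server
    "replicate/meta/meta-llama-3-8b-instruct",   # Replicate
]:
    response = completion(model=model, messages=messages)
    print(model, "->", response.choices[0].message.content[:80])
```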


Canary is awesome! We use Canary for our doc search at LiteLLM (you can see it here: https://docs.litellm.ai/docs/)

It's really useful to be able to specify the search space for a specific query (for example, Canary lets us search for "sagemaker" on our docs or on our GitHub issues).


Hi, I'm the maintainer of LiteLLM - we persist rate limits; they're written to a DB: https://docs.litellm.ai/docs/proxy/virtual_keys
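As a rough illustration (not authoritative - the endpoint and field names like rpm_limit/tpm_limit should be checked against the linked virtual-keys docs), generating a key with limits against a running proxy looks something like:

```python
import requests

# Hypothetical local proxy address and admin key; see the linked docs for real setup.
PROXY_URL = "http://localhost:4000"
ADMIN_KEY = "sk-1234"

resp = requests.post(
    f"{PROXY_URL}/key/generate",
    headers={"Authorization": f"Bearer {ADMIN_KEY}"},
    json={
        "models": ["gpt-4o-mini"],
        "rpm_limit": 60,     # requests per minute
        "tpm_limit": 10000,  # tokens per minute
    },
)
print(resp.json())  # returns a virtual key; the limits are persisted in the proxy's DB
```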

- LiteLLM Proxy is fully compatible with the OpenAI SDK.
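Concretely, that means you can point the stock OpenAI client at the proxy. A minimal sketch, assuming openai>=1.0 and a proxy running locally on port 4000 (both assumptions):

```python
from openai import OpenAI

# Point the regular OpenAI SDK at the LiteLLM proxy instead of api.openai.com.
client = OpenAI(
    api_key="sk-anything",             # a LiteLLM virtual key, or any string if auth is off
    base_url="http://localhost:4000",  # assumed local proxy address
)

resp = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello from the proxy"}],
)
print(resp.choices[0].message.content)
```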


I'm the LiteLLM maintainer. Can you elaborate on what you're looking for us to do here?


Ran some testing and discovered that Llama 2 on Replicate is faster than ChatGPT!

Code - https://github.com/BerriAI/litellm/blob/main/cookbook/Evalua...
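Roughly, the shape of the comparison is just timing the same prompt against both models. A sketch (not the cookbook code; model identifiers are illustrative and the Replicate one may need a version pin):

```python
import time
from litellm import completion

prompt = [{"role": "user", "content": "Summarize the plot of Hamlet in two sentences."}]

# Illustrative model identifiers only.
for model in ["gpt-3.5-turbo", "replicate/meta/llama-2-70b-chat"]:
    start = time.time()
    completion(model=model, messages=prompt)
    print(f"{model}: {time.time() - start:.2f}s")
```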

Are others seeing similar results?


What local or in-K8s-cluster model servers would you recommend adding?

Should we add support for llama.cpp and vllm.ai in the proxy server? Or should we assume you can host them on your own infra and have the proxy server send requests to your hosted model?
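For the second option, since vLLM exposes an OpenAI-compatible endpoint, requests can just be pointed at it. A sketch, assuming an in-cluster vLLM service (the service URL and model name are placeholders):

```python
from litellm import completion

# Assumed in-cluster vLLM service exposing the OpenAI-compatible API.
response = completion(
    model="openai/meta-llama/Llama-2-7b-chat-hf",  # "openai/" prefix = any OpenAI-compatible server
    api_base="http://vllm.default.svc.cluster.local:8000/v1",
    api_key="none",                                # vLLM doesn't require a real key by default
    messages=[{"role": "user", "content": "ping"}],
)
print(response.choices[0].message.content)
```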


IMO don’t try to be the one-stop shop to host models. There are too many players with all sorts of advancements (e.g., grammar-based stopping, continuous batching, novel quantization, etc.) and you won’t be able to keep up.

There is a ton of boilerplate around the actual model server that’s just busywork, but if done wrong it can be a huge performance suck. Solve that.

Build the proxy that works with the most model servers out there. Do it in a way that, once you have mindshare, the model server makers will find it easy to put up a PR so that they can claim your proxy supports their server.

Don’t take a hard dependency on non-OSS stuff - being able to build an “on-prem” solution (read “deployed into customer’s VPC”) is table stakes for anyone to use your offering for a lot of enterprise use cases.

Edit: another unsolved problem - different models need slightly different prompts to solve the same problem well…


If it makes sense to expand the scope to provide a particular model server and the group can easily be the best at it, I say go for it. But do it as a separate (but perhaps connected) project from this one.

But in general I’m in agreement that this sounds like a separate concern from any given model server.

That said, where is a list of model servers for the most commonly wanted LLMs at this point?

Perhaps maintaining a list of those that do and don’t work with the proxy would be helpful.


Hey bredren - our supported list (if that's helpful) is here: https://litellm.readthedocs.io/en/latest/supported/

We're adding new integrations every day, so if there's any specific one you'd like to add feel free to let us know (discord/ticket/email/etc.) - here's my email: krrish@berri.ai


Yes, you use your own API keys. You can set them as env variables. Either set them as os.environ['OPENAI_API_KEY'] or set them in .env files: https://litellm.readthedocs.io/en/latest/supported/
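For example, a minimal sketch (the key values are placeholders; keys follow the usual <PROVIDER>_API_KEY naming):

```python
import os
from litellm import completion

# Placeholder keys - set only the providers you actually use, or load them from a .env file.
os.environ["OPENAI_API_KEY"] = "sk-..."
os.environ["ANTHROPIC_API_KEY"] = "sk-ant-..."

response = completion(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```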


Thanks for sharing! While your library looks really powerful, my goal with LiteLLM is simplicity.


Good points. Probably going to add streaming output and function-calling support. As for retries, tenacity does a great job already.
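For the retry piece, a hedged sketch of what "just use tenacity" looks like around a LiteLLM call (the wrapper function and its settings are examples, not part of either library):

```python
from tenacity import retry, stop_after_attempt, wait_exponential
from litellm import completion

@retry(stop=stop_after_attempt(3), wait=wait_exponential(min=1, max=10))
def completion_with_retries(**kwargs):
    # Retries up to 3 times with exponential backoff on any exception.
    return completion(**kwargs)

resp = completion_with_retries(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "hi"}],
)
```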


Azure models have custom deployment names - e.g., I call mine 'chat-gpt-test1' - so I require some flag to know if it's an Azure model.
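One way LiteLLM handles this today (worth verifying against the current docs) is an "azure/" model-name prefix plus Azure-specific settings. A sketch, assuming the deployment is named 'chat-gpt-test1' and with placeholder endpoint/version values:

```python
import os
from litellm import completion

# Azure needs its own endpoint and API version in addition to the key (placeholder values).
os.environ["AZURE_API_KEY"] = "..."
os.environ["AZURE_API_BASE"] = "https://my-endpoint.openai.azure.com"
os.environ["AZURE_API_VERSION"] = "2023-07-01-preview"

# The "azure/" prefix is the flag that tells LiteLLM this is an Azure deployment name.
response = completion(
    model="azure/chat-gpt-test1",
    messages=[{"role": "user", "content": "hi"}],
)
```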

