
OpenAI's multi-agent framework, Swarm, only supports models from OpenAI.

OpenSwarm uses LiteLLM to add support for any LLM provider: Anthropic, Mistral, Ollama, Hugging Face, Groq, Replicate.
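For anyone curious what that looks like in practice, here is a minimal sketch using litellm.completion; the model-name strings below are illustrative and may need to match your own accounts/deployments:

```python
from litellm import completion

messages = [{"role": "user", "content": "What's the weather like today?"}]

# Same call shape, different providers; LiteLLM routes based on the model string.
# Model names here are examples only.
for model in [
    "gpt-4o-mini",                               # OpenAI
    "anthropic/claude-3-haiku-20240307",         # Anthropic
    "ollama/llama3",                             # local Ollama server
    "replicate/meta/meta-llama-3-8b-instruct",   # Replicate
]:
    response = completion(model=model, messages=messages)
    print(model, "->", response.choices[0].message.content[:80])
```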


Canary is awesome! We use Canary for our doc search at LiteLLM (you can see it here: https://docs.litellm.ai/docs/)

It's really useful to be able to specify the search space for a specific query (for example, Canary lets us search for "sagemaker" on our docs or on our GitHub issues).


Hi, I'm the maintainer of LiteLLM - we persist rate limits; they're written to a DB: https://docs.litellm.ai/docs/proxy/virtual_keys
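As a rough illustration (not authoritative - the endpoint and field names like rpm_limit/tpm_limit should be checked against the linked virtual-keys docs), generating a key with limits against a running proxy looks something like:

```python
import requests

# Hypothetical local proxy address and admin key; see the linked docs for real setup.
PROXY_URL = "http://localhost:4000"
ADMIN_KEY = "sk-1234"

resp = requests.post(
    f"{PROXY_URL}/key/generate",
    headers={"Authorization": f"Bearer {ADMIN_KEY}"},
    json={
        "models": ["gpt-4o-mini"],
        "rpm_limit": 60,     # requests per minute
        "tpm_limit": 10000,  # tokens per minute
    },
)
print(resp.json())  # returns a virtual key; the limits are persisted in the proxy's DB
```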

- LiteLLM Proxy is fully compatible with the OpenAI SDK.
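Concretely, that means you can point the stock OpenAI client at the proxy. A minimal sketch, assuming openai>=1.0 and a proxy running locally on port 4000 (both assumptions):

```python
from openai import OpenAI

# Point the regular OpenAI SDK at the LiteLLM proxy instead of api.openai.com.
client = OpenAI(
    api_key="sk-anything",             # a LiteLLM virtual key, or any string if auth is off
    base_url="http://localhost:4000",  # assumed local proxy address
)

resp = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello from the proxy"}],
)
print(resp.choices[0].message.content)
```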


I'm the LiteLLM maintainer. Can you elaborate on what you're looking for us to do here?


Ran some testing and discovered that Llama 2 on Replicate is faster than ChatGPT!

Code - https://github.com/BerriAI/litellm/blob/main/cookbook/Evalua...
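Roughly, the shape of the comparison is just timing the same prompt against both models. A sketch (not the cookbook code; model identifiers are illustrative and the Replicate one may need a version pin):

```python
import time
from litellm import completion

prompt = [{"role": "user", "content": "Summarize the plot of Hamlet in two sentences."}]

# Illustrative model identifiers only.
for model in ["gpt-3.5-turbo", "replicate/meta/llama-2-70b-chat"]:
    start = time.time()
    completion(model=model, messages=prompt)
    print(f"{model}: {time.time() - start:.2f}s")
```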

Are others seeing similar results?


What local or in-K8s-cluster model servers would you recommend adding?

Should we add support for llama.cpp and vllm.ai in the proxy server? Or should we assume you can host them on your own infra and have the proxy server send requests to your hosted model?
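For the second option, since vLLM exposes an OpenAI-compatible endpoint, requests can just be pointed at it. A sketch, assuming an in-cluster vLLM service (the service URL and model name are placeholders):

```python
from litellm import completion

# Assumed in-cluster vLLM service exposing the OpenAI-compatible API.
response = completion(
    model="openai/meta-llama/Llama-2-7b-chat-hf",  # "openai/" prefix = any OpenAI-compatible server
    api_base="http://vllm.default.svc.cluster.local:8000/v1",
    api_key="none",                                # vLLM doesn't require a real key by default
    messages=[{"role": "user", "content": "ping"}],
)
print(response.choices[0].message.content)
```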


IMO don’t try to be the one-stop shop to host models. There are too many players with all sorts of advancements (e.g., grammar-based stopping, continuous batching, novel quantization, etc.) and you won’t be able to keep up.

There is a ton of boilerplate around the actual model server that’s just busywork, but if done wrong it can be a huge performance suck. Solve that.

Build the proxy that works with the most model servers out there. Do it in a way that, once you have mindshare, the model server makers will find it easy to put up a PR so that they can claim your proxy supports their server.

Don’t take a hard dependency on non-OSS stuff - being able to build an “on-prem” solution (read “deployed into customer’s VPC”) is table stakes for anyone to use your offering for a lot of enterprise use cases.

Edit: another unsolved problem - different models need slightly different prompts to solve the same problem well…


If it makes sense to expand the scope to provide a particular model server and the group can easily be the best at it, I say go for it. But do it as a separate (but perhaps connected) project from this one.

But in general I’m in agreement that this sounds like a separate concern from any given model server.

That said, where is a list of model servers for the most commonly wanted LLMs at this point?

Perhaps maintaining a list of those that do and don’t work with the proxy would be helpful.


Hey bredren - our supported list (if that's helpful) is here: https://litellm.readthedocs.io/en/latest/supported/

We're adding new integrations every day, so if there's any specific one you'd like to add feel free to let us know (discord/ticket/email/etc.) - here's my email: krrish@berri.ai


Yes, you use your own API keys. You can set them as env variables. Either set them as os.environ['OPENAI_API_KEY'] or set them in .env files: https://litellm.readthedocs.io/en/latest/supported/
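For example, a minimal sketch (the key values are placeholders; keys follow the usual <PROVIDER>_API_KEY naming):

```python
import os
from litellm import completion

# Placeholder keys - set only the providers you actually use, or load them from a .env file.
os.environ["OPENAI_API_KEY"] = "sk-..."
os.environ["ANTHROPIC_API_KEY"] = "sk-ant-..."

response = completion(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```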


Thanks for sharing! While your library looks really powerful, my goal with LiteLLM is simplicity.


Good points. Probably going to add streaming output and function-calling support. As for retries, tenacity does a great job already.
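For the retry piece, a hedged sketch of what "just use tenacity" looks like around a LiteLLM call (the wrapper function and its settings are examples, not part of either library):

```python
from tenacity import retry, stop_after_attempt, wait_exponential
from litellm import completion

@retry(stop=stop_after_attempt(3), wait=wait_exponential(min=1, max=10))
def completion_with_retries(**kwargs):
    # Retries up to 3 times with exponential backoff on any exception.
    return completion(**kwargs)

resp = completion_with_retries(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "hi"}],
)
```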


Azure models have custom deployment names - e.g., I call mine 'chat-gpt-test1' - so I require some flag to know if it's an Azure model.
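One way LiteLLM handles this today (worth verifying against the current docs) is an "azure/" model-name prefix plus Azure-specific settings. A sketch, assuming the deployment is named 'chat-gpt-test1' and with placeholder endpoint/version values:

```python
import os
from litellm import completion

# Azure needs its own endpoint and API version in addition to the key (placeholder values).
os.environ["AZURE_API_KEY"] = "..."
os.environ["AZURE_API_BASE"] = "https://my-endpoint.openai.azure.com"
os.environ["AZURE_API_VERSION"] = "2023-07-01-preview"

# The "azure/" prefix is the flag that tells LiteLLM this is an Azure deployment name.
response = completion(
    model="azure/chat-gpt-test1",
    messages=[{"role": "user", "content": "hi"}],
)
```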

