“Do politics have artifacts?” was the rejoinder article. IMO that article should be as widely read as the main one, because it provides a warning to those who take the main one as gospel. Link: https://journals.sagepub.com/doi/abs/10.1177/030631299029003...
It's curious that Anthropic is entering the LLMOps tooling space; this definitely comes as a surprise to me, since both OpenAI and HuggingFace seem to avoid building prompt engineering tooling themselves. Is this a business strategy of Anthropic's? An experiment? Regardless, it's cool to see a company like them throw their hat into the LLMOps ring beyond being a model provider. Interested to see what comes next.
ChainForge lets you do this, and also set up ad-hoc evaluations with code, LLM scorers, etc. It also shows model responses side-by-side for the same prompt: https://github.com/ianarawjo/ChainForge
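For reference, a code-based evaluator in ChainForge is just a small Python function you paste into the evaluator node, run once per response. A minimal sketch, assuming the node passes a response object with a .text field (check the docs for the exact API):

    def evaluate(response):
        # Score one LLM response; here, flag whether the reply stays under 50 words.
        # response.text is assumed to hold the model's output string.
        return len(response.text.split()) <= 50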
There is a long-term vision of supporting fine-tuning through an existing evaluation flow. We originally created this because we were worried about how to evaluate ‘what changed’ between a fine-tuned LLM and its base model. I wonder if Vertex AI has an API we could plug into, though, or if it’s limited to the UI.
Hey Eric! Thank you! As an aside, we are looking to interview some people who’ve used ChainForge (you see, we are academics who must justify our creations through publications… crazy, I know). Would you or anyone on your team be interested in a brief chat?
Thank you for the kind words! Looking at the photo, I think you wouldn’t need the last prompt node there.
As far as evaluating functions goes, that’s unfortunately a ways off. But we generally prioritize features based on how many people have asked for them in GitHub Issues. (For instance, Chat Turn nodes came from an Issue.) If you post a feature request there, it’ll move up our priority list, and we can also clarify precisely what the feature should be.