Hey all! I've been developing a prompt engineering interface that helps users query LLMs with parametrized prompts and compare responses across models. It's an early demo, but we've already used it internally in my academic lab to evaluate and choose prompts for other research projects that involve building LLM applications. Let me know what you think, or if you encounter any bugs or issues.
Blog post here: https://ianarawjo.medium.com/introducing-chainforge-a-visual...