You can’t (practically) unit test LLM responses, at least not in the traditional sense. Instead, you do runtime validation with a technique called “LLM as judge.”

This involves having another prompt, and possibly another model, evaluate the quality of the first response. Your code then retries in a loop whenever the judge rejects a response, and raises an alert if it keeps failing.
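To make the retry loop concrete, here is a rough Python sketch. Everything in it is an assumption for illustration: `generate_with_judge`, `JudgeRejectedError`, and the `generate`, `judge`, and `alert` callables are hypothetical names, not part of any real library. You would plug in your own model calls for `generate` and `judge`, and your own paging or logging hook for `alert`.

```python
from typing import Callable


class JudgeRejectedError(RuntimeError):
    """Raised when every attempt was rejected by the judge."""


def generate_with_judge(
    generate: Callable[[str], str],      # hypothetical: calls the primary model
    judge: Callable[[str, str], bool],   # hypothetical: second prompt/model, True = acceptable
    prompt: str,
    max_attempts: int = 3,
    alert: Callable[[str], None] = print,  # hypothetical: swap in paging/logging
) -> str:
    """Return the first response the judge accepts, retrying up to max_attempts."""
    for _ in range(max_attempts):
        response = generate(prompt)
        # The judge sees both the prompt and the response it is grading.
        if judge(prompt, response):
            return response

    # Every attempt failed: raise an alert, then surface the failure to the caller.
    message = f"judge rejected all {max_attempts} attempts for prompt: {prompt!r}"
    alert(message)
    raise JudgeRejectedError(message)
```

Passing the model calls in as plain callables is a deliberate choice here: the loop itself stays deterministic, so you can unit test it with stubs even though the model output can't be tested that way.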



