I’m thinking of continuous evaluation for an LLM in production: after each call, a webhook sends the input/output pair off for evaluation.
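
Roughly what I have in mind, as a minimal sketch using only the Python standard library. The endpoint URL, the payload field names, and the function names (`build_payload`, `post_json`, `call_with_evaluation`) are all assumptions made up for illustration, not part of any existing tool. The `llm` argument stands in for whatever client actually produces the completion.

```python
import json
import time
import urllib.request
import uuid
from typing import Callable

# Hypothetical evaluator endpoint; replace with your own.
EVAL_WEBHOOK_URL = "https://example.com/eval-webhook"


def build_payload(prompt: str, completion: str, model: str) -> dict:
    """Bundle one LLM call into a JSON-serializable evaluation event."""
    return {
        "id": str(uuid.uuid4()),       # lets the evaluator deduplicate events
        "timestamp": time.time(),      # seconds since the epoch
        "model": model,
        "input": prompt,
        "output": completion,
    }


def post_json(url: str, payload: dict, timeout: float = 2.0) -> int:
    """POST the payload as JSON and return the HTTP status code."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


def call_with_evaluation(
    llm: Callable[[str], str],
    prompt: str,
    model: str = "my-model",
    send: Callable[[str, dict], int] = post_json,
) -> str:
    """Call the LLM, then forward the input/output pair to the evaluator."""
    completion = llm(prompt)
    try:
        send(EVAL_WEBHOOK_URL, build_payload(prompt, completion, model))
    except Exception:
        # Evaluation is best-effort: a webhook failure must not break
        # the user-facing call.
        pass
    return completion
```

Note that this version sends the webhook synchronously, so a slow evaluator adds up to the timeout to every request. In real traffic it would probably make sense to hand the payload to a background worker or queue instead.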