Hacker News
Replacing Judges with Juries: Evaluating LLM Generations with a Panel of Models (arxiv.org)
47 points by Jimmc414 7 months ago | 4 comments



This reminds me of science fiction author Peter Watts' novella "The Freeze-Frame Revolution", where a spaceship has two AIs: one that has been running for a million years and another that is rebooted daily to start with a fresh state. The long-running AI confers with the rebooted AI for a second opinion on important issues. The second AI doesn't know it's "killed" daily, but eventually starts to suspect. And this is just a small subplot! If you like hard SF jam-packed with big ideas, I highly recommend "The Freeze-Frame Revolution" and Watts' other novels.


I'll have to read it!

I've been mulling over a sci-fi setting focused on keeping humans relevant in the context of advanced technology. One of the core ideas is that AIs decay without human interaction. State backups don't prevent it, because noticing clues that they've been out just makes it worse.


I guess the whole "it's easier to be a critic than a writer" thing applies to LLMs too.


I was going to knock this for incrementally improving performance while massively increasing costs, but it's actually 7x less expensive. Not bad.



