want to thank the op for sharing this; i threw this together over the last couple of days ramping up to the "steamroll" - we think one of the key problems with LLMs in general, but especially voice, is evals, and we wanted a good place to evaluate voice-to-voice systems. these systems can be end-to-end like openai's, or (asr+llm)->tts, or asr->(llm+tts), or asr->llm->tts
we built an Elo benchmark, very much in the style of LMSYS, and will be releasing results every two weeks
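for anyone curious how the rating side works, here's a minimal sketch of a standard Elo update from pairwise votes - this is just the textbook formula with assumed parameters (K=32, 400-point scale), not necessarily what bench.audio uses internally:

```python
# standard Elo rating update from a single pairwise comparison.
# K and the 400-point scale are conventional defaults, assumed here for illustration.

def expected_score(r_a: float, r_b: float) -> float:
    """Probability that system A beats system B under the Elo model."""
    return 1 / (1 + 10 ** ((r_b - r_a) / 400))

def elo_update(r_a: float, r_b: float, score_a: float, k: float = 32) -> tuple[float, float]:
    """Update both ratings after one matchup; score_a is 1.0 (A wins), 0.5 (tie), or 0.0."""
    e_a = expected_score(r_a, r_b)
    new_a = r_a + k * (score_a - e_a)
    new_b = r_b + k * ((1 - score_a) - (1 - e_a))
    return new_a, new_b

# two systems start even at 1000; A wins one vote -> A moves to 1016, B to 984
new_a, new_b = elo_update(1000, 1000, 1.0)
```

(LMSYS-style leaderboards typically aggregate many such votes, often with a Bradley-Terry fit rather than sequential updates, but the pairwise-preference idea is the same.)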
source code here: https://github.com/thevoicecompany/bench.audio
will be adding a proper contributing guide soon