
Hi, I built the client UI for this and... yeah, I really wanted to get Firefox working :(

We needed a way to measure voice-to-voice latency from the end user's perspective, and found Silero voice activity detection (https://github.com/snakers4/silero-vad) to be the most reliable at detecting when the user has stopped speaking. That's the moment we start the timer, and we stop it again when audio is received from the bot.
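To make the timing concrete, here is a minimal sketch of that start/stop logic. It is not the code from the repo: the class name, method names, and the injectable clock are all hypothetical, and it assumes the VAD and the audio pipeline each expose some kind of callback to hook into.

```typescript
// Hypothetical helper for illustration only; none of these names come from
// pipecat-ai/web-client-ui or from Silero VAD.
type Clock = () => number; // returns a timestamp in milliseconds

class VoiceToVoiceTimer {
  private readonly now: Clock;
  private speechEndedAt: number | null = null;
  readonly samples: number[] = [];

  // Date.now is used as a portable default. In a browser you would likely
  // pass () => performance.now() instead, since it is monotonic.
  constructor(now: Clock = Date.now) {
    this.now = now;
  }

  // Call when the VAD reports that the user has stopped speaking.
  onUserSpeechEnd(): void {
    this.speechEndedAt = this.now();
  }

  // Call when the VAD reports that the user started speaking again.
  // This discards a pending measurement so it is not counted.
  onUserSpeechStart(): void {
    this.speechEndedAt = null;
  }

  // Call when the first bot audio arrives. Returns the latency in ms,
  // or null if no end-of-speech event is pending.
  onBotAudio(): number | null {
    if (this.speechEndedAt === null) return null;
    const latency = this.now() - this.speechEndedAt;
    this.speechEndedAt = null;
    this.samples.push(latency);
    return latency;
  }
}
```

Note that the reading is only as good as the end-of-speech event: if the VAD fires too early or too late, the timer starts at the wrong moment and the number is off by the same amount.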

Silero runs via ONNX Runtime (with WASM). Whilst it sort-of-kinda works in Firefox, the VAD seems to misfire more than it should, which makes the latency numbers somewhat absurd. I really want to get it working though! I'm still trying.

The code for the UI VAD is here: https://github.com/pipecat-ai/web-client-ui/tree/main/src/va...






Do you know why the algorithm behaves differently from one browser to another? I would expect all browsers to run the code in exactly the same way.


