
whisper.cpp supports a model with "speaker segmentation" or "local diarization". It is called "local" because it doesn't name the distinct speakers; it only tells you when the speaker changes. See https://github.com/ggerganov/whisper.cpp/issues/1715#issueco.... Once you compile whisper.cpp and download the model, run `main` with that model and the option `-tdrz true`.
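
Since the output only marks where the speaker changes, a small post-processing step can cut the transcript into turns. Here is a rough sketch in Python. It assumes the change shows up in the transcript text as the literal marker `[SPEAKER_TURN]`; check your own output and adjust the constant if it differs. The `split_turns` helper and the sample text are made up for illustration and are not part of whisper.cpp.

```python
import re

# Assumption: speaker changes appear in the transcript as this literal marker.
SPEAKER_TURN = "[SPEAKER_TURN]"


def split_turns(transcript: str) -> list[str]:
    """Split a transcript into chunks, one per uninterrupted speaker turn."""
    parts = transcript.split(SPEAKER_TURN)
    # Collapse line breaks and repeated spaces left over from the segment layout.
    turns = [re.sub(r"\s+", " ", part).strip() for part in parts]
    # Drop empty chunks, e.g. when the marker is the last thing in the text.
    return [turn for turn in turns if turn]


# Made-up sample text, not real whisper.cpp output.
sample = (
    "Hello, welcome to the show. [SPEAKER_TURN] "
    "Thanks for having me. [SPEAKER_TURN] "
    "Let's get started."
)

for index, turn in enumerate(split_turns(sample)):
    print(f"turn {index}: {turn}")
```

This gives you turns, not identities: turn 0 and turn 2 may or may not be the same person, so naming the speakers is still a separate step.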



Thank you. This is exactly it, perfect. I will now try to detect the speaker. I guess it might be easier for podcasts where one side asks questions and the other tends to respond, but perhaps not!



