Show HN: Using 3D Convolutional Neural Networks for Speaker Verification (github.com/astorfi)
59 points by irsina on June 25, 2017 | 5 comments


We used a 3D convolutional architecture to build the speaker model so that it simultaneously captures the speech-related and temporal information in the speakers' utterances.
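For readers unfamiliar with the operation, here is a minimal sketch of a single 3D convolution in plain Python. It is not the code from the linked repository. The axis layout (utterance, time frame, spectral feature), the toy sizes, and the name `conv3d_valid` are all assumptions made for illustration. The point is only that one kernel slides along all three axes at once, so each output value mixes information across utterances, time, and frequency.

```python
def conv3d_valid(volume, kernel):
    """Naive 'valid' 3D cross-correlation (the operation deep-learning
    libraries call convolution). Both arguments are nested lists
    indexed as [depth][height][width]."""
    d, h, w = len(volume), len(volume[0]), len(volume[0][0])
    kd, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    out = []
    for i in range(d - kd + 1):
        plane = []
        for j in range(h - kh + 1):
            row = []
            for k in range(w - kw + 1):
                acc = 0.0
                # Sum the element-wise product over the kernel window.
                for a in range(kd):
                    for b in range(kh):
                        for c in range(kw):
                            acc += volume[i + a][j + b][k + c] * kernel[a][b][c]
                row.append(acc)
            plane.append(row)
        out.append(plane)
    return out


# Hypothetical toy input: 3 utterances x 4 time frames x 5 spectral features.
cube = [[[1.0] * 5 for _ in range(4)] for _ in range(3)]
# Hypothetical 2 x 2 x 2 kernel spanning utterances, time, and frequency.
kernel = [[[1.0] * 2 for _ in range(2)] for _ in range(2)]

features = conv3d_valid(cube, kernel)
print(len(features), len(features[0]), len(features[0][0]))
```

A real model would stack several such layers with learned kernels in a deep-learning framework; the loop form is used here only to make the sliding window explicit.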


Have you compared your system to state-of-the-art results on the NIST Speaker Recognition Evaluations? I'm asking because the datasets released for these evaluations every two years or so have been driving progress in the field for close to two decades, and your dataset is completely unknown in the speaker recognition literature. Not trying to discredit your work, but that's THE obvious data you would use to evaluate anything speaker recognition related, and not doing that just raises some flags.


Thank you so much for your feedback. You are absolutely correct. For publications, people often use whatever dataset they have; at Google, for example, they usually use their own huge dataset for the speaker-dependent scenario rather than NIST. But I accept that a comparison on the NIST datasets could showcase the work better. Unfortunately, I am no longer working on that project, so I have no time to do that, but I am open to collaboration on this effort if anyone is interested in continuing it.


Any hints about future improvements for open set speaker diarization?


Deep learning is moving into that area, but when? That's obviously another question. Lol!



