
Show HN: Using 3D Convolutional Neural Networks for Speaker Verification - irsina
https://github.com/astorfi/3D-convolutional-speaker-recognition
======
irsina
We used a 3D convolutional architecture to create the speaker model, so that
it simultaneously captures the speech-related and temporal information in the
speakers' utterances.
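
To make the idea concrete, here is a rough sketch of what a single 3D
convolution computes when features from several utterances are stacked into
one volume. This is an illustration only: the (utterances, frames, frequency
bins) layout, the sizes, the all-ones kernel and the function names are
assumptions made for the example, not code or settings from the repository.

```python
# Illustrative sketch only. Shapes, names and the kernel are hypothetical;
# nothing here is taken from the linked repository.

def conv3d_valid(volume, kernel):
    """Naive 'valid' 3D cross-correlation on nested lists indexed [u][t][f].

    u = utterance index, t = time frame, f = frequency bin (assumed layout).
    """
    U, T, F = len(volume), len(volume[0]), len(volume[0][0])
    ku, kt, kf = len(kernel), len(kernel[0]), len(kernel[0][0])
    out = []
    for u in range(U - ku + 1):
        plane = []
        for t in range(T - kt + 1):
            row = []
            for f in range(F - kf + 1):
                acc = 0.0
                # One output value mixes neighbouring utterances, frames
                # and frequency bins at the same time.
                for du in range(ku):
                    for dt in range(kt):
                        for df in range(kf):
                            acc += volume[u + du][t + dt][f + df] * kernel[du][dt][df]
                row.append(acc)
            plane.append(row)
        out.append(plane)
    return out


def shape(x):
    """Return the (depth, height, width) of a nested 3D list."""
    return (len(x), len(x[0]), len(x[0][0]))


# Hypothetical input: 4 utterances, 6 frames each, 5 frequency bins per frame.
utterances = [[[float(u + t + f) for f in range(5)] for t in range(6)]
              for u in range(4)]
# Hypothetical 2x3x3 kernel of ones (a plain sum over the local window).
kernel = [[[1.0 for _ in range(3)] for _ in range(3)] for _ in range(2)]

features = conv3d_valid(utterances, kernel)
print(shape(features))
```

In a real network the kernel weights are learned and a framework's 3D
convolution layer would replace the loops; the point is only that each output
value depends on the utterance, time and frequency axes together.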

------
woodson
Have you compared your system to state-of-the-art results on the NIST Speaker
Recognition Evaluations? I'm asking because the datasets released for these
evaluations every two years or so have been driving progress in the field for
close to two decades, and your dataset is completely unknown in the speaker
recognition literature. I'm not trying to discredit your work, but that's THE
obvious data to use for evaluating anything related to speaker recognition,
and not using it raises some flags.

~~~
irsina
Thank you so much for your feedback ... You are absolutely correct ...
For publication, people use all sorts of datasets; Google, for example,
usually uses its own huge dataset for the speaker-dependent scenario rather
than the NIST data. But I accept that a comparison on the NIST dataset could
showcase the work better. Unfortunately, I am no longer working on that
project, so I have no time to do that, but I am open to collaboration if
anyone is interested in continuing this effort.

------
skndr
Any hints about future improvements for open-set speaker diarization?

~~~
irsina
Deep learning is heading in that direction, but when? That's obviously
another question. Lol!

