
Ask HN: What to do with 500M call recordings? - urupvog
We have around 500 million call recordings with an average duration of 1 minute in English/Hindi and other languages spoken in India. Just wondering, what can we do with this huge dataset? What type of models can we create?
======
muzani
I'd probably run it through some kind of sentiment analysis and try to model
happiness/satisfaction on the y-axis against time on the x-axis.

Then you can map that to conversations and see which words increase or
decrease satisfaction. Look for sudden changes in the happiness contour.
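
A rough sketch of that idea, assuming each call has already been split into
utterances and scored by some sentiment model (the scores in [-1, 1], the
function names, and the 0.5 threshold are all made up for illustration):

```python
from collections import defaultdict

def sudden_changes(scores, threshold=0.5):
    """Return (index, delta) pairs where the score jumps by at least threshold."""
    return [
        (i, scores[i] - scores[i - 1])
        for i in range(1, len(scores))
        if abs(scores[i] - scores[i - 1]) >= threshold
    ]

def word_effects(utterances):
    """Average the score change at each utterance over the words it contains.

    utterances is a time-ordered list of (text, score) pairs for one call.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for (_, prev_score), (text, score) in zip(utterances, utterances[1:]):
        delta = score - prev_score
        for word in set(text.lower().split()):
            totals[word] += delta
            counts[word] += 1
    return {word: totals[word] / counts[word] for word in totals}

# Hypothetical per-utterance scores for one call, in time order.
call_scores = [0.2, 0.3, 0.25, -0.6, -0.5, 0.1, 0.4]
changes = sudden_changes(call_scores)

# Hypothetical transcript fragments paired with their scores.
call = [("hello", 0.0), ("sorry about that", 0.5), ("please hold", -0.5)]
effects = word_effects(call)
```

This only attributes a change to the words in the utterance where it shows up,
so with real data you would want far more calls per word before trusting any
of the averages.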

You can also try to map that to customer surveys at the end of the call - see
if you can improve perceived quality by, say, greeting them cheerfully early
on, or if different phrases will defuse anger better. Maybe even see if you
can spot weird patterns, like if certain accents trigger anger or contempt.

------
sharemywin
Can they be transcribed via a free speech-to-text library?

Can you find an existing transcription service that you can upsell to clients?

Use that service to refine your dataset.

Then you can build your own model and cut out the service.

Assuming you have it organized by client, you could map it to industry and
then build an industry classifier.
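
As a toy sketch of what that classifier could look like, here is a
bag-of-words naive Bayes over transcripts. The example transcripts, the
industry labels, and the `train`/`predict` names are invented for
illustration; a real one would need proper tokenization for Hindi and
code-mixed speech.

```python
import math
from collections import Counter, defaultdict

def train(examples):
    """Count words per industry from (transcript, industry) pairs."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    vocab = set()
    for text, label in examples:
        tokens = text.lower().split()
        word_counts[label].update(tokens)
        label_counts[label] += 1
        vocab.update(tokens)
    return word_counts, label_counts, vocab

def predict(model, text):
    """Pick the industry with the highest add-one-smoothed log probability."""
    word_counts, label_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, n in label_counts.items():
        score = math.log(n / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in text.lower().split():
            if tok in vocab:  # ignore words never seen in training
                score += math.log((word_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Made-up transcripts labeled by the client's industry.
examples = [
    ("my card payment was declined", "banking"),
    ("please check my account balance", "banking"),
    ("my order has not been delivered", "retail"),
    ("i want to return this order", "retail"),
]
model = train(examples)
```

Calling `predict(model, "return my order")` on this toy data picks "retail".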

Run it through sentiment analysis and translation.

------
Spooky23
Analyze them for quality purposes to improve whatever it is that you're
supposed to be doing.

------
mtmail
Did the people consent to the recordings?

~~~
urupvog
Yes, all of these are business to consumer calls.

~~~
mtmail
I'd be interested in what the consent looks like. Did users agree to recording
"for quality purposes" or specifically for to-be-defined-in-the-future data
modeling?

~~~
urupvog
Of course, for quality purposes.

~~~
AnimalMuppet
Then you're kind of restricted by the consent they gave. They didn't give
their consent for you to use them for any purpose that you could come up with;
they gave their consent for _very specific_ purposes.

