For reference, here is an example of the end product of a (in my opinion) very good transcription service: https://www.grc.com/sn/sn-645.txt
I do not know how much Elaine (who makes those transcriptions) costs, but of course it's going to be more than Amazon Transcribe's cost. What I wonder is, what would be the cost to take the output of Amazon Transcribe, and fix/format it into something like what I linked?
Regardless, one thing that (I think) is important, is that having at least the output of Amazon Transcribe (or something similar) will make it easier for people to find something that was said in your podcast. At least, until Google/etc. start transcribing audio they find, and indexing that…
The only problem is that he didn't really go over the quality of the transcription. Transcription by robot is still AWFUL; a human can generally do way better.
It's hilarious how bad it is in this age of AI we supposedly live in.
It's not even a question of AI. In the past epoch, when we used to own (rather than rent/subscribe to) software, it was common to use desktop apps such as Dragon NaturallySpeaking. They didn't pretend they could do a great job out of the box: untrained, the result would be just like what you get with Google, Amazon and the rest. To get good results, you needed to train it on your voice first. The training didn't take long, and after that it produced exactly what you said (with exceptions such as unusual proper names). I would never accept the quality of all these "modern" transcription APIs; they look like a step backward to me.
I haven't had a chance to test AWS Transcribe yet, but I run a benchmark of the IBM, Microsoft and Google transcription APIs weekly.
So far, IBM Watson works best for English, and Google for other languages. For example, the transcript of Trump's inauguration speech (16 minutes 38 seconds, 8,371 characters) gives me a Levenshtein distance to the official one of (smaller is better):
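For anyone who wants to run the same kind of comparison, here is a minimal sketch of the scoring step in Python. It is not the benchmark's actual code: the function names and the normalization (lowercasing, collapsing whitespace) are my own assumptions, and what you normalize changes the numbers a lot.

```python
def normalize(text: str) -> str:
    """Lowercase and collapse whitespace (an assumed, illustrative normalization)."""
    return " ".join(text.lower().split())


def levenshtein(a: str, b: str) -> int:
    """Character-level edit distance: insertions, deletions, substitutions."""
    # Keep the shorter string in the inner loop so the rows stay small.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))  # distance from "" to each prefix of b
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,                      # delete char_a
                current[j - 1] + 1,                   # insert char_b
                previous[j - 1] + substitution_cost,  # substitute or match
            ))
        previous = current
    return previous[-1]


def score(machine_transcript: str, reference_transcript: str) -> int:
    """Distance between normalized transcripts; smaller is better."""
    return levenshtein(normalize(machine_transcript),
                       normalize(reference_transcript))
```

You would call `score(api_output, official_text)` once per service and compare the results. Pure Python like this is quadratic in the transcript length, so for anything much longer than a speech you would probably want a compiled implementation instead.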
I tried to use this, but it looks like they have to review me for access.
I've tried the IBM and Google transcription APIs on a podcast with overlapping speakers, and they turned out dadaist poetry. I've been meaning to check out other solutions.
Click the "Episode ___" link in the first paragraph, then search for "Direct mp3 download" on that page.
I'd love to see more details on what you do. I haven't found a good automated solution yet. It seems like the solutions with acceptable quality involve Mechanical Turk.