So this thread's 52-minute podcast on a complex technical topic with multiple speakers could cost ~$200. A programming-related podcast is already a niche topic with a tiny audience, and an Array Languages podcast is an even tinier subset of that, so the cost might not be justified.
I suppose podcasts could be uploaded to YouTube to let its speech-to-text algorithm auto-transcribe them. However, the AI algorithm is not good at tech topics with industry jargon/acronyms, so the resulting transcription will be inaccurate.
I make transcripts of all my work using Descript. It uses Google's speech-to-text algo (same as the one in YouTube, presumably) and gives you a transcript you can then edit. It costs $15/month I believe, and you have to spend some time editing a transcript that realistically won't be read by many, but it works pretty well ime (no affiliation besides being a happy customer).
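For anyone who wants to skip the subscription, here's a rough sketch of calling Google's Speech-to-Text API directly with the google-cloud-speech Python client. The bucket URI and audio settings are placeholders, and audio over about a minute has to go through the async endpoint:

    # Rough sketch, assuming `pip install google-cloud-speech` and
    # GCP credentials are set up; the gs:// URI is a placeholder.
    from google.cloud import speech

    client = speech.SpeechClient()

    config = speech.RecognitionConfig(
        encoding=speech.RecognitionConfig.AudioEncoding.FLAC,
        sample_rate_hertz=44100,
        language_code="en-US",
    )
    audio = speech.RecognitionAudio(uri="gs://my-bucket/episode.flac")

    # Long audio must use the asynchronous API.
    operation = client.long_running_recognize(config=config, audio=audio)
    response = operation.result(timeout=3600)

    for result in response.results:
        print(result.alternatives[0].transcript)

You still end up doing the same hand-editing pass afterwards, of course.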
Right. A high-quality podcast already involves a lot of pre- and post-production work on just the audio. I use Rev, which hires captioners on my behalf [0], but it's also expensive, so I use it sparingly.
Chrome now provides on-device live captions (which hook into any Chrome-originating audio) - chrome://settings/accessibility -> toggle "Live Captions" [1] - which could help alleviate some of the limitations for hearing-impaired viewers.
>Chrome now provides on-device live captions [...] which could help alleviate some of the limitations for hearing-impaired viewers
That's a great feature! But it also highlights the limited accuracy of machine-learning speech recognition for technical topics with jargon. E.g., at 27m00s, the captioning algorithm incorrectly transcribes "APL is joked about as a right only language" -- but we know the speaker actually said, "APL is joked about as a write-only language". It also incorrectly transcribes "oversonian languages" when it's actually "Iversonian languages".
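FWIW, Google's API does have a partial mitigation for the jargon problem: "speech adaptation" lets you pass phrase hints that bias recognition toward domain terms. A sketch of the config (the phrase list and boost value are just illustrations; in my experience hints help but don't eliminate these errors):

    # Sketch: bias recognition toward array-language jargon via
    # phrase hints. Drop this config into the recognize call;
    # phrases and boost strength are illustrative guesses.
    from google.cloud import speech

    config = speech.RecognitionConfig(
        encoding=speech.RecognitionConfig.AudioEncoding.FLAC,
        sample_rate_hertz=44100,
        language_code="en-US",
        speech_contexts=[
            speech.SpeechContext(
                phrases=["APL", "Iversonian", "write-only language"],
                boost=15.0,  # how strongly to favor these phrases
            )
        ],
    )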
The algorithm also doesn't differentiate multiple speakers; the generated text is just continuously concatenated even as the voices change. Therefore, a hearing-impaired reader wouldn't know which person said a particular string of words.
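Speaker separation has an API-level answer too, at least partially: enabling diarization tags every recognized word with a speaker number, so you can split the transcript by voice. Another rough sketch under the same assumptions as above (the speaker counts are guesses for this episode):

    # Sketch: speaker diarization attaches a speaker_tag to each
    # word, which lets you label who said what.
    from google.cloud import speech

    client = speech.SpeechClient()

    config = speech.RecognitionConfig(
        encoding=speech.RecognitionConfig.AudioEncoding.FLAC,
        sample_rate_hertz=44100,
        language_code="en-US",
        diarization_config=speech.SpeakerDiarizationConfig(
            enable_speaker_diarization=True,
            min_speaker_count=2,
            max_speaker_count=4,
        ),
    )
    audio = speech.RecognitionAudio(uri="gs://my-bucket/episode.flac")
    operation = client.long_running_recognize(config=config, audio=audio)
    response = operation.result(timeout=3600)

    # The final result aggregates word-level speaker tags.
    for word in response.results[-1].alternatives[0].words:
        print(f"speaker {word.speaker_tag}: {word.word}")

Even then, the tags get shaky when voices overlap, and you only get numbered speakers, not names.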
This is why podcasters still have to pay humans (sometimes with domain knowledge) to carefully listen to the audio and accurately transcribe it.
Samsung's bastard version of Android had a similar "Automated Subtitles" feature. It's decent for watching videos with the phone on silent, but it's pretty crap when there are lots of proper nouns and unusual jargon, as I imagine this podcast has.