
Your next conference should have real-time captioning - oskarth
https://lkuper.github.io/blog/2014/05/31/your-next-conference-should-have-real-time-captioning/
======
jamesbrownuhh
It's hard to overstate the power and value of having a text transcript of any
kind of spoken material like this. It literally unlocks a world of
possibilities - not just for search and video indexing, both massively useful
in themselves, but also to make such talks genuinely digitally accessible, in
such a way that they can be read and interpreted by people who cannot hear, or
a worldwide audience of people who don't necessarily have English as a first
or second language.

Caption all the things. There are so many benefits that it's just daft not to.

~~~
eknkc
English is not my native language; I had to learn it myself from written
material. I could read and write just fine, but there was no way I could
understand spoken English. I'd sometimes start watching some tech video, or
listen to a talk, and try to catch a few words here and there. Captioning
would have been a blessing back then.

Actually, it was. I discovered American TV shows on the internet (well.. I had
no legal way to access them). I started watching the first season of 24 with
subtitles; they were like training wheels for ~20 episodes. Then I managed to
ditch them and started just listening (with a finger on a shortcut to jump 10
seconds back).

So, please have text transcripts if possible. They make a huge difference for
non-native English speakers.

~~~
rdtsc
Oh yeah, I've learned a lot of my English from TV shows with subtitles. Some
countries (Russia) always dub English shows and movies, while others like
Romania pretty much never dub and almost always just add subtitles. I think
that accounts for a higher percentage of young Romanians speaking better
English.

~~~
pearkes
I was recently in Romania and was blown away by how casually people spoke
English! A slightly American accent, with pacing and rhythm similar to a
native speaker's.

Contrast that with German English speakers, who likely had high-quality
English classes throughout their education, yet speak with a distinct accent
and a rhythm closer to that of the German language.

I don't know if there are technical reasons for this, but anecdotally,
Romanians have said they learned a lot of their English from watching
subtitled TV and movies.

Someone else in this thread mentioned the show 24. It's fascinating that,
indirectly, watching pirated copies of a typical Hollywood entertainment
television show (which academic and intellectual communities may find "crude")
can open up a wealth of knowledge and culture (English-language-only
resources) for people who potentially wouldn't have had the educational
opportunities otherwise.

It makes my spine tingle when stories like this flip my understanding and
perspective of things like the Hollywood Entertainment Machine. Things have
value in unexpected and fascinating ways.

~~~
madeofpalk
In China, you can buy the scripts to Friends with the Chinese translation
printed below as a learning resource.

I knew a guy at university who had an OK understanding of the English
language, with a fairly Chinese accent, but would occasionally use strange
informal slang words that no one has said since Friends aired.

------
larsberg
Real-time transcription can be excellent for distributed meetings as well. We
do this on some teams at Mozilla (e.g., Servo and Rust) for all of our group
meetings. It lets people participate who couldn't make it, don't dial in to
the video conference, have a poor internet connection, or for whom English is
a second (and sometimes third) language.

Though I'll be the first to admit that I'm in awe of the skills of their
stenographer; I'm not nearly that fast, and I rely on other meeting
participants to fix up my typos (especially in code) in our collaborative
editor, [https://etherpad.mozilla.org/](https://etherpad.mozilla.org/).

------
erjiang
I first encountered steno'd captions in person at Google's TGIF and it was
awesome even for the non-hearing-impaired. You get a few seconds of scrollback
in case you miss a word, which is easy to do at large gatherings.

The only thing that seems to be missing from the !!Con transcripts is
timestamps. I'm not familiar with Plover, so it might already be a feature,
but being able to output one of the standard subtitle formats (SubRip, Timed
Text, SSA) would make remuxing an MP4/MKV of each talk with subtitles much
easier.
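For illustration, here's a minimal sketch of rendering caption cues into SubRip format. The `(start, end, text)` timing tuples are hypothetical — as far as I know, Plover itself doesn't emit timestamps, so they'd have to come from elsewhere:

```python
def fmt(seconds):
    """Format a time in seconds as the SubRip timestamp HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    h, rem = divmod(ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

def to_srt(cues):
    """Render an iterable of (start, end, text) tuples as an SRT document."""
    blocks = []
    for i, (start, end, text) in enumerate(cues, 1):
        blocks.append(f"{i}\n{fmt(start)} --> {fmt(end)}\n{text}\n")
    return "\n".join(blocks)

# Hypothetical cues for demonstration
cues = [
    (0.0, 2.5, "Welcome to !!Con."),
    (2.5, 6.0, "Our first talk is about stenography."),
]
print(to_srt(cues))
```

Once you have an .srt file, remuxing it into the video container should be a one-liner with ffmpeg, along the lines of `ffmpeg -i talk.mp4 -i talk.srt -c copy -c:s mov_text talk-subbed.mp4` (MP4 wants the `mov_text` subtitle codec; MKV can carry SRT directly).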

~~~
jamesbrownuhh
If you have a good transcript, you can temporarily upload the video to YouTube
(mark it as private if you like, it only needs to be accessible to you) - then
in the "Captions and Subtitles" options, upload the plain text transcript.

YouTube will sync the transcript to the audio (you remove the ambiguity of it
having to guess what is being said, since you're telling it exactly which
words to listen out for), and you can download the resulting automatically
timed file as an SBV or SRT file.

It's not 100% perfect, and is something of a hack, but it usually works pretty
well. :)

~~~
PointerReaper
[http://amara.readthedocs.org/en/latest/index.html](http://amara.readthedocs.org/en/latest/index.html)
(Amara: Create Captions and Subtitles) might be of interest to you.

~~~
jamesbrownuhh
Thank you! Those docs foxed me completely but I did take a look at Amara's
main site - an interesting approach. Not a magic bullet but I love the
principle of opening up tasks like these to the crowd.

------
randlet
This would be hugely welcome at conferences for me! I'm hearing impaired and
while I can usually _hear_ the speakers (with hearing aids), I sometimes
struggle to understand the actual words they are saying (poor speech
discrimination). Seems like something that could really increase the quality
of conferences for everyone for not a lot of expense. Kudos to !!con for doing
this.

~~~
calineczka
I am in the exact same situation. Would love it.

~~~
cmadan
Same here! I would love it, and it would make me actually attend conferences.

------
Mister_Snuggles
Even though my hearing is fine, I'd love to see something like this!

Sometimes the acoustics of the room are bad, or there's background noise, or
maybe you just missed a word and need to have it repeated. All of these
problems would be solved by this system.

This kind of thing would benefit all conference goers.

------
aantix
Are there any speech to text systems out there that could do this reliably,
say 80% accuracy?

~~~
olgeni
If you actually repeat the speech into the microphone, you can go quite a bit
higher.

~~~
aantix
I'd like to explore whether there are any automated solutions.

What software/SDKs have you used?

~~~
jamesbrownuhh
Even the best automated solutions rely on recognising someone who is speaking
clearly and precisely, ensuring that every word is well spaced and completely
clear.

There are no automated solutions that can universally do a good job of
transcribing natural speech from people who aren't specifically "speaking to
be recognised", if that makes sense.

Maybe one day, but not yet. It's a problem waiting to be solved, so the reward
for the first who can really crack it will be substantial.

------
liamotootle
I work with CaptionAccess, which provides the type of services covered in the
article. One reason there isn't more captioning is that people associate
captioning with disabled people and accessibility.

The article began with people looking into captioning as a means of meeting an
accessibility need (a speaker losing a hearing aid) and ended with the
discovery that there are many other benefits beyond accessibility. It was a
great read and everyone won. Many planners never get to that point.

The lack of widespread real-time captioning isn't a technical or cost issue,
it's an education issue.

------
pronoiac
It really helps with accessibility! Captioning helps elsewhere, too, and there
are other methods - we've gotten crowdsourced transcripts of the Metafilter
Podcast by using Fanscribed[1], and it's been appreciated.

[1] [https://www.fanscribed.com](https://www.fanscribed.com)

------
ploversteno
A little more commentary on the difference between realtime stenography and
automated speech recognition: [http://blog.stenoknight.com/2012/05/cart-
problem-solving-ser...](http://blog.stenoknight.com/2012/05/cart-problem-
solving-series-sitting.html)

------
VeejayRampay
But how would the people creating the transcripts deal with all the technical
lingo in, say, programming conferences? The lexicon seems highly
domain-specific and can get quite obscure at times... That being said, this
seems like it would indeed be a GREAT addition to any conference.

------
justincormack
Anyone know any London/UK based people who can do this?

~~~
DanBC
Contact your local organisations for people with hearing impairment and ask
them for details of "speech to text" people.

There is an oddity in the UK where a person with a hearing impairment can get
speech-to-text paid for, but they are the client and they get to control what
happens to the text. It would be useful if speech-to-text paid for by the
state and used in public meetings were automatically given to the meeting as a
whole, as well as to the person as an individual.

------
petergreen
You should also have real-time audience feedback powered by MeetingPulse.
MeetingPulse: Make your events interactive! [http://meet.ps](http://meet.ps)

(I just thought if you make the plug outrageously shameless it will pass for a
joke, still being a plug though)

Jokes and plugs aside, it can produce a graph of the audience's
moment-by-moment sentiment, which can be overlaid on the transcript/video, so
you know how they felt at each moment of the talk.

