

Darpa Wants You to Transcribe, and Instantly Recall, All of Your Conversations - sk2code
http://www.wired.com/dangerroom/2013/03/darpa-speech/

======
sophacles
You know what a great use for this system would be? Taking all those videos of
talks, video tutorials, and so on, and making transcripts. I think it is cool
that there are now howto videos for everything, and most talks worth watching
are now online - however somewhere along the way text got lost. It sure would
be nice to not have to watch a whole 30 minute video when I could skim the
transcript for the bits that I want to see (timestamped) and jump the video to
that point. Even more useful would be auto-annotating slides from the talk
recording. It's fun browsing slides, but more fun would be to read the talk
that went with it.
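
A minimal sketch of the skim-and-jump idea, assuming the transcript is already available as a list of (start time in seconds, text) segments. The segment data and the names `find_mentions` and `as_timestamp` are made up for illustration; no particular transcription service or video player is implied.

```python
def find_mentions(segments, keyword):
    """Return the (start_seconds, text) segments whose text mentions keyword."""
    needle = keyword.lower()
    return [(start, text) for start, text in segments if needle in text.lower()]


def as_timestamp(seconds):
    """Format a position in seconds as MM:SS for display next to the text."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes:02d}:{secs:02d}"


# Hypothetical transcript of a talk: (start time in seconds, spoken text).
segments = [
    (0.0, "Welcome, today we talk about speech recognition."),
    (95.5, "Here is how the acoustic model works."),
    (742.0, "Questions about the acoustic model?"),
]

# Skim for a topic, then seek the video to the printed position.
for start, text in find_mentions(segments, "acoustic model"):
    print(as_timestamp(start), text)
```

The start time of each hit is what a player would need in order to seek to that part of the video.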

~~~
peripetylabs
Google could probably implement this with YouTube. Their automated
translations are a start, but if viewers could edit the transcripts as the
video plays, they could come up with good transcriptions very quickly.

------
tobylane
A UK expert of sorts covered this very well. Watch [0] or read a spoiler-filled
review [1], or google for more reviews.

[0] http://www.channel4.com/programmes/black-mirror/episode-guide/series-1/episode-3

[1] http://blogs.independent.co.uk/2011/12/19/review-of-black-mirror-–-‘the-entire-history-of-you’/

~~~
nathos
This episode was optioned by Robert Downey Jr. to be turned into a film.

------
webjprgm
It occurred to me that just as typesetting has caused some standardization in
written letters, and the written form of language has caused standardization
in spelling, the necessity to speak clearly to a computer could cause a
similar standardization in verbal communication. It would be interesting if we
adapt more to the machine than the machine to us, learning to be a little more
careful about enunciating and learning to use commands in speech that are
meant for the computer, like saying "comma" in a sentence.

~~~
dclowd9901
Interesting point, as I often find myself (being a human) having a difficult
time understanding many people, mostly due to their unfriendly communication
patterns (low talking, mumbling, not enunciating, drawls/accents, etc).

Seems like one of those things the general public pushes back on, if for no
other reason than virtually everyone gets offended when told they don't know
how to speak.

~~~
sneak
I use this one a lot: "I can't understand what you're saying."

I figure it's closer to showing someone "your attempt at communication has
failed" versus telling them "you don't know how to speak".

YMMV.

------
goldfeld
What I really could use today is software on my computer that records my
whole workday in front of it, and later lets me jump straight to the points
where I was making sound and easily extract worthy material.

I explain: I often spontaneously start reciting some poetry or prose that comes
from god knows where in my subconscious, in the form and meaning of a speech, a
pitch, a story or even a song. Sometimes it's just a beat or a melody. These
often turn out to be incredibly creative and surprise me, but by the time I
set up something to try and record it, surely as winter comes every year the
creativity has fled the building, and I can't record anything worthwhile. My
creativity is like a lovely squirrel who stops playing with the acorn once he
notices he's being watched.

I have looked into surveillance software now and then after such an episode,
but really, this should be easier. This should be an always-on Evernote audio
note with soundwave analysis strapped on. I wonder what artists and even
writers use in their studios and studies, for surely those who don't have such
a setup are missing out on a lot of their greatest, most spontaneous material
if they're anything as eccentric as myself.

~~~
gagege
You wouldn't even have to keep recordings of the whole day. I learned a neat
trick from watching the making of the show "Planet Earth". To record some of
the amazing unexpected events, they would have a camera running constantly
that keeps about 10 seconds of HD video in memory at a time, then deletes it.
When they see something cool happen, like a shark jumping out of the water,
they just hit a button. That button tells the camera to not delete video that
is more than 10 seconds old.

That way they didn't fill up their hard drives and they got the unexpected
stuff.

What you could do is have a sound analyzer listen for speech and act like
that button: you'd get whatever happened in the 10 seconds, or however long,
before you started spontaneously jabbering. :)
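
A rough sketch of that pre-roll trick, assuming audio arrives as fixed-size chunks of float samples. The class name `PreRollRecorder`, the helper `is_loud` and the 0.1 threshold are all made up for illustration, and the loudness check is only a crude stand-in for a real speech detector.

```python
from collections import deque


class PreRollRecorder:
    """Holds only the newest `pre_roll` chunks until triggered."""

    def __init__(self, pre_roll):
        # A bounded deque drops its oldest chunk once it is full.
        self.buffer = deque(maxlen=pre_roll)
        self.saved = []
        self.recording = False

    def feed(self, chunk):
        """Add one chunk: keep it for good if recording, else buffer it."""
        if self.recording:
            self.saved.append(chunk)
        else:
            self.buffer.append(chunk)

    def trigger(self):
        """The 'button': keep the buffered past and everything from now on."""
        if not self.recording:
            self.saved.extend(self.buffer)
            self.buffer.clear()
            self.recording = True

    def stop(self):
        """Return the captured clip and go back to buffering."""
        clip, self.saved = self.saved, []
        self.recording = False
        return clip


def is_loud(chunk, threshold=0.1):
    """Hypothetical trigger: mean absolute amplitude above a threshold."""
    if not chunk:
        return False
    return sum(abs(sample) for sample in chunk) / len(chunk) > threshold


def capture(chunks, pre_roll=3):
    """Run chunks through the recorder, triggering on the first loud one."""
    recorder = PreRollRecorder(pre_roll)
    for chunk in chunks:
        if is_loud(chunk):
            recorder.trigger()
        recorder.feed(chunk)
    return recorder.stop()
```

With one-second chunks, `pre_roll=10` would keep roughly the 10 seconds before the trigger. Memory use stays bounded by the buffer size until something actually happens.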

~~~
goldfeld
This is sound advice, pardon the pun. Still I wonder if I should ramp up the
TiVoish recorder to about an hour or so, because often these are so
spontaneous in fact, almost subconsciously uttered, that it might take me more
than 10 seconds to even realize I'm doing something weird, let alone to figure
out I'm on to something.

