
Deepgram – Find Damning Soundbites - pain_perdu
http://blog.deepgram.com/new-tech-lets-journalists-find-damning-soundbites/
======
yakult
The next logical evolution is to make your own Vocaloid using publicly
available speeches, and synthesize any soundbite you want.

Coming to you 2020: presidential candidate X may or may not have said that he
hates women, kicks babies, loves Hitler, and plans to nuke Wales. You don't
even care any more because your airwaves have been saturated with both
candidates (and their Vocaloids) saying all kinds of crazy shit in
disinformation and counter-disinformation and counter-counter-information
campaigns and everybody's desensitized.

~~~
Kronopath
Already exists, in some form:

[http://talkobamato.me/synthesize.py?speech_key=77ff00cb8af50...](http://talkobamato.me/synthesize.py?speech_key=77ff00cb8af500c23d74d5bbc7f05560)

------
gaius
I just wanted to check that none of us have ever said something among friends
that we wouldn't want random people or employers hearing about even years
later? Even one sentence that could be taken out of context? And we're totally
cool about being fired over it? Just checking.

~~~
pjc50
Of course we're not cool about that, but I don't think that applies to
presidents. The president isn't an "employee".

~~~
philh
I think it very much applies to presidents. There are vastly more important
considerations for a president than "has never said anything regrettable".

~~~
notahacker
But nobody uses "has never said anything regrettable" as the litmus test for
Presidency.

On the other hand "audio searches reveal the presidential candidate is telling
a massive lie about always having expressed profound opposition to
Controversial Thing X, and actually repeatedly ridiculed the cause in not
well-reported meetings prior to becoming the candidate" could and should be an
electoral issue.

~~~
gaius
Yes and no. If I were ever running for POTUS I'd rather not have anything I
might or might not have said about radical Marxism while I was a student come
to define my candidacy. Because this isn't about the one specific case of
Trump; once this cat is out of the bag there's no putting it back.

~~~
notahacker
How about a middle-aged career politician who is his major party's candidate
to run the economy telling a small and sympathetic audience that he was a
Marxist and the 2007/8 financial crash was an opportunity he'd been waiting
for for a generation? That's an actual example from British politics, albeit
an MP so known for making unfortunate remarks that there were probably
journalists willing to _manually_ trawl through weeks' worth of footage of
fringe meetings he attends to find the most outrageous ones, but there's no
doubt a tool like this could yield better results.

For better and (mostly) for worse, dredging up muck from student days has been
around since the dawn of politics, and at least in the UK we're comfortable
enough about the fact people change for even a Conservative Party chairman to
be willing to volunteer they were a radical Marxist in their teens. Being able
to rapidly cross-reference everything a politician has said on record on a
given subject in recent years is a new and far more useful way of subjecting
them to scrutiny.

This is less about the specific case of Trump's sex remarks (which was
obviously a case of somebody knowing their secret video was dynamite and
waiting for the right time to light the fuse) and more about the difficulty of
establishing whether Trump was telling the truth about having opposed the Iraq
war all along, which is where efficient search of stuff that's already in the
public domain but not necessarily in the public consciousness comes in really,
really useful.

------
tspike
This tech, of course, has utility for many professions beyond journalism.
Given the expansive nature of our laws, it is impossible for anyone not to
break a law at some point. When it's possible to simply search a massive
database for any transgression, enforcement becomes an arbitrary endeavor.

What checks can be put in place for this?

~~~
sverige
We could stop making certain kinds of speech illegal. ( _Not_ including
"yelling 'fire' in a crowded theater.")

~~~
dredmorbius
This goes far beyond speech.

Fraud, misrepresentation, drug use, sexual harassment, conspiracy, aiding and
abetting, stolen property, just off the top of my head.

Drawn from financial records, statements, documents, phone calls, social
graphs, etc.

You're coming across as rather too narrowly focused.

~~~
jimmytidey
The speech isn't illegal.

In this case someone is indicating that in the past they've sexually assaulted
someone.

The speech is just evidence pointing towards the act, and the act is the
problem.

~~~
dredmorbius
My point isn't that the speech is illegal. It's that there are vast
classes of criminal conduct which, _merited or otherwise_, could almost
certainly be applied against virtually anyone you cared to, with pervasive
audio monitoring and search.

Cardinal Richelieu wrote some six lines or so on this subject. You should look
them up.

[https://en.wikiquote.org/wiki/Cardinal_Richelieu](https://en.wikiquote.org/wiki/Cardinal_Richelieu)

(Or perhaps not, the authenticity is, as many things are, disputed. But he
carries the blame. Which is perhaps the most poetic justice of this point. Or
is it injustice. Words, words, words.)

~~~
lmm
> Cardinal Richelieu wrote some six lines or so on this subject.

Which tells you that this problem is not new by any means.

~~~
dredmorbius
Scales and rates matter.

------
salimmadjd
No real "journalist" should be looking for soundbites. It's the job of special
interest groups and super PACs.

------
grogenaut
Cursory use of the app failed on all searches:

I searched for oil, mexico, automatic weapons, isis... it found "crisis", a
reference to oil without actually being on the correct time stamp, mexico got
several 50% results none of which were isis. automatic weapons got "nuclear
weapons". so basically 0/4 searches.

pretty sure that most networks use the closed caption data to find clips. NBC
had an indexed system for all of their tapes in prototype in 98 for their
whole back catalog. Not sure if it ever went online.

------
mobiuscog
'Journalists' don't need to find damning soundbites - they're given them by
the opposition.

~~~
dredmorbius
Fair point that Opposition Research is a thing.

[https://en.wikipedia.org/wiki/Opposition_research](https://en.wikipedia.org/wiki/Opposition_research)

------
dredmorbius
OK, throw this into the mix: many of you reading this have smartphones which
are voice controlled, and for which voice control is activated at all times.
In the case of Google, that processing _must_ take place on Google's
centralised servers. Siri may or may not do centralised processing (and can
operate in standalone modes). Microsoft's Cortana, Facebook's "M" (IIRC) and
Amazon's Echo are all various stylings of "Stasi in a Glade form factor", as
Maciej Cegłowski so memorably put it.[1]

Voice stores distressingly cheaply in terms of space, and with the Internet of
(broken) Things (that spy on you), you stand a good chance of finding yourself
surrounded by microphones in the most unexpected locations,[2] controlled by a
wide variety of quite probably competing interests.[3] And if they cannot find
what they're looking for in the surveillance tape itself, they'll simply
manufacture their own evidence using your own phonemes[4] and video.[5]

________________________________

Notes:

1\.
[https://twitter.com/pinboard/status/732985370204233728](https://twitter.com/pinboard/status/732985370204233728)

2\. [http://www.inquisitr.com/3097029/government-surveillance-in-...](http://www.inquisitr.com/3097029/government-surveillance-in-san-francisco-hidden-microphones-planted-by-fbi-at-bus-stops-under-rocks/)

3\. [http://www.locusmag.com/Perspectives/2016/09/cory-doctorowth...](http://www.locusmag.com/Perspectives/2016/09/cory-doctorowthe-privacy-wars-are-about-to-get-a-whole-lot-worse/)

4\.
[http://www.theatlantic.com/technology/archive/2016/09/hackin...](http://www.theatlantic.com/technology/archive/2016/09/hacking-forgeries/499775/)

5\.
[https://www.youtube.com/watch?v=ohmajJTcpNk](https://www.youtube.com/watch?v=ohmajJTcpNk)

~~~
AlexCoventry

> In the case of Google, that processing must take place on Google's
> centralised servers

Doesn't recognition of the initialization phrase "OK, Google" take place on
the phone? Sending a continuous stream of audio back to Google's servers sounds
expensive.

~~~
dredmorbius
AFAIU (which is little), "OK, Google" is processed locally. Whatever follows
is processed remotely.

I should have mentioned voice-activated televisions as a whole 'nother class
of attack.

------
wwweston
"Every proclamation guaranteed free ammunition for your enemies..."

~~~
manifestsilence
"Why do you act like you're the smartest in the room, why do you act like
you're the smartest in the room..."

------
gh1
TL;DR A deep learning algorithm for searching words ("freedom", "Obama" etc.)
in a recorded audio/video clip.

The title is clickbait, but the claim is rather interesting. It says that it
has 80 % accuracy for transcribing an audio clip compared to 20 % for speech-
to-text.

~~~
abrookewood
I thought the figures sounded a little suspect too - if automated speech-to-text
was really that bad, then presumably products like Dragon Dictate etc would
never have been successful. Obviously they have the benefit of training, but
still, 20% sounds ludicrously low.

~~~
CoryG89
I think you need to take into consideration that this is designed to be used
on voice where the speaker is not aware that they are speaking to a computer
(or that they are being recorded at all).

When people talk to Siri, Cortana, or Dragon, they take unnatural care in the
clarity of their speech compared to normal talk only meant for humans. Also,
the speaker may not have been speaking directly into a microphone, there may
be lots of background noise, etc.

All of these factors probably combine for a much lower accuracy than what
Apple, Microsoft, and Google are going to be dealing with in usual cases. Also
keep in mind they all have incentive to inflate their own products' accuracy
score. Not that the same incentive doesn't also exist for this
company/product.

------
theparanoid
There's little evidence the Trump clip was found using 'new tech'. Most likely
somebody in Access Hollywood remembered recording it and it went from there.

~~~
ojbyrne
The article seems fairly careful to not claim that.

~~~
throwanem
The article is a marketing piece for a technology sold by its publisher.

------
thr328982
I found something really sexist:

> _Women have always been the primary victims of war. Women lose their
> husbands, their fathers, their sons in combat._

~~~
thr328982
Here is another one:

> _Marriage has historic, religious and moral content that goes back to the
> beginning of time, and I think a marriage is as a marriage has always been,
> between a man and a woman_

