
Scammers deepfake CEO’s voice to talk underling into $243k transfer? - wslh
https://nakedsecurity.sophos.com/2019/09/05/scammers-deepfake-ceos-voice-to-talk-underling-into-243000-transfer/
======
Verdex
Boss, executive, manager, supervisor. Person in charge. Many of these people
want their subordinates to enact their will in whatever way requires the least
effort on their own part.

They want to give ambiguous direction and receive exactly what they imagined
but still be delightfully surprised. They don't want to hear about the
unintended consequences of getting exactly what they asked for; they want it
to "just work and don't trouble me with the details".

Pedantry is unacceptable. Everything must be interpreted exactly how it was
meant to be. The rules are meant for others. If a subordinate fails while
breaking a rule, then they are fired. If a subordinate fails because they
didn't break a rule, then they are fired.

Ultimately, as we come closer to perfect impersonation at the press of a
button, these 'people in charge' will have a tough time adapting. They want
absolute obedience and unquestioning loyalty. Unless their subordinate was
talking to a social engineer using a deep fake.

On the other hand, the people in charge who trust their subordinates, who let
them fail and encourage them to do better, who let them question their
approach and decisions. These people in charge will find themselves and their
organizations more resilient to these attacks. "Yeah, Jeff might be a real
jerk sometimes and he's always second guessing decisions, but he doesn't
transfer $12 million just because he gets a phone call where someone who
sounds like me is upset that the money hasn't been transferred."

~~~
semiotagonal
This development is a blessing in disguise for people tired of dealing with
"decision-makers" giving orders continuously. A tedious authentication step is
a good speedbump against impulsive CEOs.

~~~
comboy
We should be able to trust what's displayed as the caller (a stolen unlocked
smartphone is a different problem). It doesn't seem that hard. It's not rocket
science. I don't understand why we don't have that yet.

~~~
close04
That only helps if you're able to recognize that number. Otherwise it's just
an equally plausible number to be called from. In this case it's the voice
that tricked the person and I'm pretty sure that if your boss calls you and
demands something, the likelihood of complying is pretty high. Your suspicions
about not recognizing the phone number will not even get time to properly
form.

We need bosses to understand that this is not the way to request things; they
should always go through authorized channels that implement better
"authentication" processes. This may also dampen their willingness to make
"out of band" requests.

~~~
nwallin
What millennium do you live in that when one of your contacts calls you, the
thing that shows up is a number and not the caller's name?

That's my biggest pet peeve about Android. (Totally unrelated discussion,
sorry) I can't tell it to send all calls that aren't from my contact list
directly to voicemail. They're universally spam.

~~~
close04
The millennium in which my CEO can call me from any phone in the company and
I'd be hard pressed to tell if it's legit or not? Or from a phone at a parent
or subsidiary company in any number of countries that I have basically no
chance of recognizing? The simple fact that it's the CEO calling discourages
anyone from questioning it because C level executives and senior management
have a way of imposing this kind of "absolute authority" in many companies.

------
BryantD
Although the general class of problem is real, this story didn't hold up under
examination. Here's a follow-up article in Spiegel:
[https://www.spiegel.de/netzwelt/web/deepfakes-werden-erkennungsmethoden-die-naechsten-uploadfilter-a-1286373.html](https://www.spiegel.de/netzwelt/web/deepfakes-werden-erkennungsmethoden-die-naechsten-uploadfilter-a-1286373.html)

It's in German, but Google Translate to the rescue. Key paragraphs:

'But when asked how they know that such software, and not a human voice
imitator, was used, a spokeswoman told SPIEGEL by email: "We do not know with
100% certainty. Theoretically, it could have been a human voice imitator, but
we do not assume so. There are some clues (but no proof)."

'The clues she mentions, in turn, have no technical relevance; they can by no
means be interpreted as evidence of a deepfake. The supposed "first case" can
therefore at best be called a "possible case", which is quite symptomatic of
the debate about deepfakes and the resulting risks.'

~~~
slovenlyrobot
Possible case, further tempered by an "infosec industry source". :) It is
their job to sell fear.

~~~
sho
The entire "infosec industry" is fraudulent, in my estimation. A bunch of
assholes selling powerpoints and not a whole lot else. I've seen it a lot.

None of them know what they're talking about. Source: my direct experience

~~~
slovenlyrobot
I wouldn't say that, but they do suffer from something the IT industry as a
whole suffers from: there is/was a need, and by fulfilling that need, they
find themselves in a quandary where they're committed to something that must
continue to exist in order to avoid inflicting self-harm.

You can't go into business selling software on the assumption you'll sell a
single copy and your work is finished. Infosec is no different. There is
always this paradoxical need to ensure you never become entirely redundant,
regardless of the function you specialize in

I think a lot of the great "evolutionary difficulties" in our industry can be
phrased this way. It's why things like Microsoft Excel are laughed at even as
hundreds of thousands of developers are trained every year in the art of
constructing what are effectively bespoke spreadsheet apps in Django. I hope
for a correction some day, just as much as I hope I'm on the right side of it
when it comes.

~~~
sho
You misunderstand. "Infosec" is a shopping list item on govt requirements. "We
need this, this, this, and it needs to be secure!" They assume security is
just this additional ingredient that can be purchased, mixed in, and boom;
project is secure, like mixing in chili to anything will make it hot. The
infosec companies do everything they can to encourage this assumption.

It's all bullshit, and these people are pure scum.

~~~
slovenlyrobot
Speaking as a developer, I've had my ass handed to me by competent security
folk on several occasions. Unfortunately in the software industry, security
must regularly be an additional ingredient that can be purchased and mixed in,
because 98% of us are completely clueless and absolutely not thinking about
this stuff as we're trying to clear a sprint board.

~~~
thephyber
Serious questions:

> because 98% of us are completely clueless

Do universities/bootcamps teach OWASP-style classes of programming
vulnerabilities these days? Some developers are curious enough to learn them
on their own, but many are oblivious.

I suspect there could be a startup idea here somewhere.

> absolutely not thinking about this stuff as we're trying to clear a sprint
> board

Does anyone have a decent tool/process for remembering all of the detailed
tasks for every type of software deliverable? I find myself in a state of
cognitive coma after sprint planning when I need to divide tasks into
subtasks.

~~~
vkou
> Do universities/bootcamps teach OWASP-style classes of programming
> vulnerabilities these days? Some developers are curious enough to learn them
> on their own, but many are oblivious.

I had exactly one course that touched on security.

A course in web programming.

The instruction we received consisted of: "If your project is not secure, it
will lose points."

I'm not sure a single person in that class had one point taken off for getting
security wrong.

Why would a university care about educating people in something ephemeral and
domain-specific like security, when it could instead be teaching them about
complexity theory, Dijkstra, and third normal form?

------
danso
Sure, this is a troubling new variation, but it's still the kind of social
engineering that can be deflected with reasonable protocols, such as a second
channel of authentication (e.g. "Sounds good boss, can you email me the
transfer/order id to confirm things and I'll finish it from here"). The
underling CEO was just as vulnerable to someone imitating his boss's voice the
old-fashioned way.

Regardless of the advances in AI, I think text-based social engineering will
still be prevalent and efficient. IIRC, the Anonymous hackers who targeted
HBGary got SSH access by fooling the company's chief security officer. Sure,
they did it by emailing from the company owner's account (thanks to a weak
admin password surfaced through other vulnerabilities). But the hackers didn't
even know the account name when asking for a password reset. The CSO could've
stopped things by asking to do a call, e.g. _"Hey, let me call you, I need to
walk you through this part"_, or even just texting the phone for confirmation.

[https://arstechnica.com/tech-policy/2011/02/anonymous-speaks-the-inside-story-of-the-hbgary-hack/3/](https://arstechnica.com/tech-policy/2011/02/anonymous-speaks-the-inside-story-of-the-hbgary-hack/3/)

------
ta999999171
Does this trend bring us into the promised "post-truth" world?

...or will it force humanity to finally acknowledge the very basic security
concept of public/private keys for signing important shit?

~~~
segfaultbuserr
I recently chatted about the potential use of DeepFake in future propaganda
with other HN readers, and apparently it's going to be here soon. You may want
to follow this thread,
[https://news.ycombinator.com/item?id=21269150](https://news.ycombinator.com/item?id=21269150)

But in short,

Several observations...

* A major threat is to end-to-end VoIP/telephony encryption. Currently, the most common way to verify one's public key and ensure that no MITM wiretapping is going on is to read a hash (encoded as words) aloud in the phone conversation. It's used in many protocols, such as Signal, or ZRTP by Phil Zimmermann et al. It's not full security, but it's considered a shortcut with reasonable security for most people; the assumption is that voice synthesis cannot yet be done in real time. With deepfakes, that assumption breaks, and this shortcut is about to be closed off. Full verification, like signing your VoIP key with your long-term public key, or asking security questions (e.g. OTR's SMP algorithm), will be needed (perhaps not for everyone, but now for far more people).

* KirinDave suggested that, in addition to signing, timestamping will also be important. If automatic synthesis of video and audio becomes widespread, one way to prove the authenticity of material is to timestamp it as soon as it's recorded, or even to timestamp it in real time if real-time forgery becomes a serious threat (I hope not). This is one of the few use cases where a blockchain actually makes sense, if you want minimal trust in third parties.
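The spoken-hash shortcut in the first observation can be sketched in a few lines. This is a toy version: the word list, key, and nibble-per-word encoding below are illustrative, not ZRTP's actual SAS construction (which uses the 256-entry PGP word lists):

```python
import hashlib

# Toy 16-entry word list; real protocols map each byte to one of 256
# phonetically unambiguous words.
WORDS = ["alpha", "bravo", "charlie", "delta", "echo", "foxtrot",
         "golf", "hotel", "india", "juliet", "kilo", "lima",
         "mike", "november", "oscar", "papa"]

def short_auth_string(session_key: bytes, n_words: int = 4) -> str:
    """Derive a short authentication string (SAS) from the negotiated key.

    Both endpoints compute this from their own view of the key exchange
    and read it aloud; a MITM who substituted keys produces different words.
    """
    digest = hashlib.sha256(session_key).digest()
    # One nibble (4 bits) per word for this 16-word toy list.
    return " ".join(WORDS[digest[i] & 0x0F] for i in range(n_words))

print(short_auth_string(b"negotiated-session-key"))
```

The whole scheme only works if the voice reading the words can't be forged in real time, which is exactly the assumption deepfakes attack.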

The final thought is that it's time to rewatch _Ghost in the Shell_ (the TV
animation series, not the movie). Released in the early 2000s, it portrayed
and predicted our world remarkably well, and it'll give you a lot of
inspiration about what a future society might look like. In one episode, the
protagonists uncover a government conspiracy to intensify a military conflict
involving nuclear weapons, and they have the following dialogue.

> "How do we stop it? Can we post the video footage online?"

> "No. Video footage is seen as completely untrustworthy today."

~~~
filoeleven
Your second observation regarding time stamps is something I’m really
interested in. The broader topic is provenance: the ability to say “this AV
stream came from this device at this time.” I just wrote a post echoing
KirinDave’s idea but perhaps expanding the scope in this thread:
[https://news.ycombinator.com/item?id=21526278](https://news.ycombinator.com/item?id=21526278)

The more verifiable pieces of data that you can associate with a recording,
the more you can trust that it came from when/who it claims to. If I send a
recording that can be tied to:

\- a timestamp service with my request for one stored in a blockchain,

\- a tamper-evident device that signs the data with its own private key,

\- my own private key

then you can have a high degree of certainty that I am the one who recorded
and sent the content and that it has not been altered. I could still be tied
to a chair while a voice actor impersonates me and forces me to send it. This
is after all basically the modern equivalent of a proof-of-life photo with the
kidnapped victim holding today’s newspaper. But it’s a lot more effort for
someone to go through compared to having none of those other guarantees.

It’s a fascinating topic that will only become more relevant as deepfaking
gets easier. Whoever makes the first device/system to do this, if it’s not a
flawed premise, will make a pretty penny.
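A minimal sketch of what such a provenance record might look like, assuming a tamper-evident device with a baked-in secret. The field names and secret are hypothetical, and an HMAC stands in for the asymmetric device signature (e.g. Ed25519) a real system would use:

```python
import hashlib
import hmac
import json
import time

# Stand-in for a key held inside tamper-evident hardware (hypothetical).
DEVICE_SECRET = b"secret-baked-into-tamper-evident-hardware"

def provenance_record(av_bytes: bytes, device_id: str) -> dict:
    """Bind a recording's hash to a device identity and a capture time."""
    claim = {
        "content_sha256": hashlib.sha256(av_bytes).hexdigest(),
        "device_id": device_id,
        # In a real system this would be anchored in a timestamp service.
        "captured_at": int(time.time()),
    }
    payload = json.dumps(claim, sort_keys=True).encode()
    claim["signature"] = hmac.new(DEVICE_SECRET, payload,
                                  hashlib.sha256).hexdigest()
    return claim

def verify(record: dict, av_bytes: bytes) -> bool:
    """Check the signature AND that the content hash still matches."""
    claim = {k: v for k, v in record.items() if k != "signature"}
    payload = json.dumps(claim, sort_keys=True).encode()
    expected = hmac.new(DEVICE_SECRET, payload, hashlib.sha256).hexdigest()
    return (hmac.compare_digest(expected, record["signature"])
            and claim["content_sha256"] == hashlib.sha256(av_bytes).hexdigest())
```

Altering either the recording or any field of the claim breaks verification, which is the property the stacked guarantees above are after.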

------
allovernow
Those "Grandma send money to your grandson to bail him out" scams are about to
get a lot more interesting. It's one thing to convince your grandparents never
to send money to random people calling on your behalf, but if the conversation
happens in your voice? How do you harden against such an attack when dealing with
people who are technologically illiterate?

Perhaps this opens up a market for secure identity verification that is
accessible to the layman...

~~~
cpitman
To be fair, these scammers _already_ claim to be the grandson. My grandma got
a call from me, from jail, in Mexico. I guess I had a little too much to drink
at a wedding, and I had a cold, so my voice was a little off. And I
desperately needed to post bail in a foreign country.

It wasn't the voice that tipped my grandma off. Just that I do not drink. So
that was close.

------
tjpaudio
What I don't get about these scams is what is preventing them from working
with the bank to reverse the fraudulent transfers? Sure I get that in these
instances money is usually moving between the banking systems of two different
countries, but just like we have extradition treaties, how is this not a
thing?

~~~
mrlala
Well, the issue is that wire transfers are specifically designed NOT to be
reversible. I'm not saying they can't reverse one in a case of fraud (like
this), but it's probably much more difficult, because the money would have
been IMMEDIATELY available in the fraudster's account, in which case they
probably knew to move it somewhere else or withdraw it right away (though I'm
not sure how you withdraw $200k in cash).

Anyway, that's my guess as to why this is difficult to reverse. If it were
easy to reverse, both sides would be very suspicious of each other during a
large transfer.

------
zamubafoo
This and other discussions on HN make me think about the fact that so much of
your voice has been recorded by companies when you call their support lines.
Are these troves of voice recordings stored safely with appropriate levels of
access control?

I mean, most people can usually pick up on the difference between a real voice
recording and a generated voice sample. Have there been studies where
generated voices are subjected to telephone compression and layered with
background noise to appear to be a phone call from a car?

If DeepFake voices only need 5 second clips to produce good enough versions of
anyone's voice and any discrepancy in quality could be attributed as a bad
telephone connection or masked with background noise, is anyone really safe?

~~~
octbash
> This and other discussions on HN make me think on the fact that so much of
> your voice has been recorded by companies when you call their support lines.
> Are these troves of voice recordings stored safely with appropriate levels
> of access control?

"When we said 'quality and training purposes', we were referring to training a
neural network."

------
crankylinuxuser
One way to defeat this (kind of) is to institute a callback on an internally
shared number.

So it would have gone like this:

    1. Scammer deepfakes CEO's voice
    2. Demands X dollars transferred elsewhere
    3. Employee hangs up and calls back using the pre-shared phone number
    4. Gets confirmation

Of course this can be somewhat thwarted by sim-stealing attacks. But this is
just another _layer_ of defense, that stops another 99.xxx % of attacks.
Enough of those and these attacks would be infeasible.

~~~
parliament32
Or you just don't authorize multi-million dollar irreversible transfers with a
phone call? Processes for this kind of thing are in place for a reason.

~~~
crankylinuxuser
Well, the obvious counterexample would be Fintech. Calls and messages over
approved SEC communications methods can be and are normal operating procedure.

But in this article's case, we're talking about $0.243 million, which flies
counter to "don't authorize multi-million dollar irreversible transfers".

Not sure what your point was here.

------
ipoopatwork
My bet is on a voice actor -- or an underling invoking deep fake to cover his
arse

~~~
vkou
Transferring $230,000 to your cousin's account, and hoping to stay out of
jail, by telling everyone that a deep-fake CEO told you to do that is an
incredibly dumb way to steal.

I mean, people do _incredibly_ dumb things, so it is certainly possible. I
doubt it, though.

------
Miner49er
How do they know it was AI and not a voice actor?

~~~
vectorEQ
It's a random assumption / hypothesis. Maybe it was even the real CEO... :D
That'd be too funny.

~~~
cyborgx7
It'd be an effective way to get some money.

------
rurp
> Bafflingly, though, Dessa said that its team created the Rogan replica voice
> with a text-to-speech deep learning system they developed called RealTalk,
> which generates life-like speech using _only text inputs_.

I'm missing something here. How could they replicate the sound of his voice
using _only text inputs_ [emphasis theirs]?

~~~
tantalor
The voice model is trained on voice inputs. The text-to-speech synthesis is
based on text only, rather than audio of a different voice. So, together you
have voice and text as inputs.

~~~
p1esk
Do you know if anyone has done this without using text? I mean training a
model using only voice samples from different people, and using only voice
input during inference.

------
scarmig
At some point, we are all just going to blame deepfakes for all of our
foibles.

"Honey, I sent $20k to that camgirl because I thought she was you! Deepfakes!"

------
inetknght
Related discussion (separate article) from yesterday:
[https://news.ycombinator.com/item?id=21525878](https://news.ycombinator.com/item?id=21525878)

------
ryanlol
Odds are there was no deepfaking involved here, nonsense clickbait title.

------
fitzroy
Almost everything digital in audio seems to have been a decade or two ahead of
video (non-linear editing, lossless/uncompressed quality, digital synthesis of
elements, distribution, etc), primarily due to the bandwidth and processing
power requirements being much greater for video.

So I'm curious how, with regard to deepfake technology, video seems to be well
ahead of audio. Is audio deep fake technology simply less interesting to
people? Are listeners far more sensitive to voice not being perfect? Is the
human input just a lot less helpful in modeling the output with voice vs video
(where the initial human input is only slightly more helpful than just
synthesizing from text-to-speech)?

------
donohoe
There is zero information to back up the claim that this was a deepfake and
not just impersonation.

~~~
the-dude
I haven't even seen information backing up the claim of the calls. It could
all be made up.

------
mtw
If this worked, this scheme is going to be widespread next year, making phones
even more useless.

It's OK to get spam and scams over email when you're not paying anything for
it, but there's no reason to get the same over a phone line you're paying top
dollar for.

------
jonplackett
Wouldn’t have taken much effort to verify this. Doesn’t even sound like he was
that sure it was his CEO, just says he had a German accent and a similar
melody. You’d think for a transfer this large he would want to be a bit more
certain than that!

------
rajekas
My dystopian mind thinks it's only a matter of time before a deepfake launches
fire and fury across the globe. Imagine the Cuban Missile Crisis with
deepfakes. Yes, I know there are layers and layers of safeguards, but....

------
newnewpdro
Sounds like a good story for some embezzlement.

------
scarmig
Reminds me of when some scammer bilked Google and Facebook out of $100MM
through fake invoices. I'm not sure a deepfake is really likely, or a
differentiator, here.

------
paul7986
These days, to fight tricks like a call claiming to be your bank, or a contact
you know reporting fraud, just hang up and dial their phone number directly.
Call your bank yourself and ask to be transferred to the fraud
department, dial your boss's number directly, etc.

Until scammers are able to take control of your phone number, which I hope
never happens, I think the above is a good way to fight this junk.

~~~
loopdoend
That latter part happens all the time, it’s called SIM swapping.

------
dazhengca
How could there be enough samples of his voice? Seems like quite a jump to "it
was a deep fake" unless there's more info they haven't revealed.

~~~
ry4nolson
From yesterday:
[https://news.ycombinator.com/item?id=21525878](https://news.ycombinator.com/item?id=21525878)

Only requires 5 seconds of voice audio to synthesize believable speech.

~~~
danso
The cadence of the synthesized voices is still noticeably artificial, even
for the short demo phrases. This is not to say this isn't impressive. But how
much does this method improve when it isn't constrained by a 5-second sample?
If we feed it several hours of public speeches from Martin Luther King Jr, or
hell, tens of hours of audio from President Obama or Trump – will it have the
same artificial cadence, even if the tone and pitch of the imitated voice is
accurate?

------
parliament32
"Deepfakes voices!" makes a cool headline but there's a 90% chance either the
employee or the CEO is in on it. Right now the only real fact is "the employee
claims someone who sounded like the CEO called"... it's far more probable that
either the employee is lying or the CEO actually made the call.

~~~
p1esk
Not sure what makes you skeptical. The technology to do this sort of scam was
ready at least a year ago. If anything I’m surprised it hasn’t happened sooner
(it probably did, we just haven’t heard about it). If making money transfers
is part of your job and your boss calls you to make an urgent transfer for a
reasonable amount, why wouldn’t you do it?

~~~
parliament32
Because the simplest explanation is often the right one. Do you think it's
more likely that some scamming group somehow got thousands of hours of the
CEO's voice, trained a world-class voice model, got internal info about how
the CEO usually calls and requests transfers, found the correct employee to
target, spoofed the incoming number... or was the employee's drinking buddy
like "hey, just transfer X to this offshore account and say the CEO called
you, we can pin it on that deepfakes stuff we saw in the news"? Or even more
likely, CEO wants to embezzle some money so he makes a call for the transfer
then feigns ignorance?

If there were any sort of proof that this happened, like a recording, then
sure. But there are no facts in the article other than "analysts suspect..."
and an attention-grabbing headline.

Maybe I'm just jaded after everything that's happened over the last few years,
but I take everything the media publishes with a grain of salt these days.

~~~
p1esk
My understanding is you don't need thousands of hours of the CEO's voice. You
train the model on many voices, then fine-tune it on a few hours of the target
voice.

Regardless of what happened in this particular case, this type of scam is now
possible, feasible, and will get easier to perform with time. We should be
concerned.

------
flimflamm
Maybe it was a genuine screw-up and the deepfake story was created to cover
one's ass.

------
throwaway35784
Trust is under full-on assault by the nefarious among us.

I ask you, how will we be able to trust moving forward? Will all transacting
be done in person? We can fake IDs too.

How will we be able to verify reality in this age of deception?

------
tastroder
Previous discussion from 2 months ago:
[https://news.ycombinator.com/item?id=20864659](https://news.ycombinator.com/item?id=20864659)

------
dmix
A simple predefined verbal password that updates once in a while would be
sufficient. Especially if you're transferring hundreds of thousands based on a
single phone call.
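A rotating verbal password can even be derived the way TOTP codes are, so nothing secret is ever spoken over the line except the current words. A minimal sketch, assuming a secret exchanged in person beforehand; the word list and secret here are illustrative:

```python
import hashlib
import hmac
import struct
import time

# Illustrative word list; a real deployment would use a larger,
# phonetically unambiguous one.
WORDS = ["amber", "birch", "cedar", "delta", "ember",
         "fjord", "grove", "harbor", "inlet", "juniper"]

SHARED_SECRET = b"exchanged-in-person-at-onboarding"  # hypothetical

def verbal_password(secret, period=7 * 24 * 3600, now=None):
    """Derive the current spoken passphrase, RFC 6238 (TOTP) style.

    Both parties compute this from the shared secret and the current
    time window; it rotates automatically every `period` seconds.
    """
    counter = int((time.time() if now is None else now) // period)
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha256).digest()
    return " ".join(WORDS[b % len(WORDS)] for b in mac[:3])
```

The caller speaks the current words; the employee computes them independently and compares. A deepfaked voice alone can't produce them.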

------
the-dude
So the CEO was caught embezzling 243k and he blamed it on a deepfake?

Genius.

------
yellow_lead
I have a slightly hard time believing all possible responses could be
pre-generated or done on the fly with a true deep fake. Maybe possible with
great planning.

~~~
jbattle
I assumed a human was listening in and typing responses for a text-to-voice
program to speak out

~~~
Liquix
It'd be pretty tough: the operator would have to wait for the other party to
finish their thought, comprehend said thought, formulate an appropriate
response, and then type it out on a keyboard, all whilst maintaining the
expected rhythm of a real-time phone conversation.

~~~
jbattle
Faster than writing an AI that could pass the turing test ;)

But yeah a voice actor seems like an even easier (if less press-worthy) way to
run this grift

------
epa
Poor internal controls on the part of the CEO's team.

~~~
ghaff
Yes, in this case. On the other hand, it's not hard to see that we're probably
increasingly headed towards a future with more rigorous identity and
authorization verification for at least some actions and purposes. And those
more rigorous processes will inevitably lead to more edge cases that force
people to go to a lot of trouble to verify who they are. (Yes, of course we
can get your access to your Google account back. Just bring a notarized birth
certificate and two other forms of ID to our Mountain View office between
10-11am on a Friday.)

~~~
epa
I think we will do better, but I'm an optimist.

------
mitchtbaum
The Motel 6 Saga! (Sequel)

[https://youtu.be/6ePFuTLYTaE](https://youtu.be/6ePFuTLYTaE)

------
rhacker
I tell banks and other people that I will call them back to verify the
authenticity of the incoming call.

------
slowenough
Are there ML models that can detect a faked / synthesized voice, even if we
can't?

~~~
LoSboccacc
The models that detect fakes are used to reinforce the generator's training:
[https://en.wikipedia.org/wiki/Generative_adversarial_network](https://en.wikipedia.org/wiki/Generative_adversarial_network)

~~~
Scoundreller
The best question is: is there something computationally easy to check but
computationally difficult to generate?
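There is at least one classic construction with exactly that asymmetry: proof of work (hashcash), where finding a valid nonce takes many hash attempts but checking one takes a single hash. A toy sketch of the shape such a primitive has:

```python
import hashlib
from itertools import count

def check(msg: bytes, nonce: int, difficulty: int = 4) -> bool:
    """Cheap verification: one hash plus a prefix comparison."""
    digest = hashlib.sha256(msg + str(nonce).encode()).hexdigest()
    return digest.startswith("0" * difficulty)

def find_nonce(msg: bytes, difficulty: int = 4) -> int:
    """Expensive generation: roughly 16**difficulty expected attempts."""
    for nonce in count():
        if check(msg, nonce, difficulty):
            return nonce
```

Whether an analogous asymmetry exists for distinguishing synthetic audio is the open question; this only illustrates that such one-sided primitives are possible.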

~~~
LoSboccacc
Probably, but whatever is used will just become the next generation's training
input; it's going to keep escalating indefinitely.

------
TwoBit
Anybody who will give away 243K based on a phone call alone is a moron.

~~~
cpitman
Working with much larger sums of money quickly distorts your perceptions of
what is "a lot". 243k is under the signing authority of a random Director
level manager, and to a CEO would likely feel like a rounding error.

------
exabrial
Combine this with callerId spoofing and it'd probably fool me.

------
jacquesm
Every company should pre-emptively attempt all known forms of CEO fraud every
couple of weeks as a kind of vaccination against the real thing.

~~~
ttul
This is a thing already. Large banks spend up to multiple billions per year on
security (for example) and even a 3,000 person enterprise can easily drop
millions a year. Red-teaming is an absolutely necessary aspect of security
these days.

------
_bxg1
And so it begins.

------
ttraub
Artificial voices must have improved a lot in the past 2-3 years. None of the
ones I've ever heard would pass muster. Something weird in their intonation, a
kind of sameness or monotony in their speech i.e. lack of excitability, a bit
too perfect and precise.

I just went to CereProc, one of the companies advertising a very realistic
synthetic voice, and none of their voices fully convinced me, though I have to
admit they were pretty good.

When they get so good that they are indistinguishable except through Turing
tests, and maybe not even then, we'll all be in trouble. I somehow expect that
we haven't yet reached that point, though it can't be long in coming.

~~~
malux85
You read the article, right? It literally happened. It's already good enough
to fool people who aren't paying attention.

~~~
unfunco
> "Analysts told the WSJ that they believe that artificial intelligence-
> (AI)-based software was used to create a convincing imitation of the German
> CEO’s voice…"

> "If this turns out to indeed be a deepfake audio scam…"

It didn't literally happen, they _think_ that's what happened. I'm sceptical
personally, I didn't realise deep fake audio could be done in realtime now?
And who is this CEO that must have hundreds of hours of publicly available
audio that the voice could be trained on?

~~~
armagon
Take a look at this article that hit HN yesterday:
[https://news.ycombinator.com/item?id=21525878](https://news.ycombinator.com/item?id=21525878)

