
Did Google Fake Its Big A.I. Demo? - robg
https://www.vanityfair.com/news/2018/05/uh-did-google-fake-its-big-ai-demo
======
teachrdan
From TFA:

"As Axios[0] noted Thursday morning, there was something a little off in the
conversations the A.I. had on the phone with businesses, suggesting that
perhaps Google had faked, or at least edited, its demo. Unlike a typical
business (Axios called more than two dozen hair salons and restaurants), the
employees who answered the phone in Google’s demos don’t identify the name of
the business, or themselves. Nor is there any ambient noise in Google’s
recordings, as one would expect in a hair salon or a restaurant. At no point
in Google’s conversations with the businesses did the employees who answered
the phone ask for the phone number or other contact information from the A.I.
Further, California is a two-party consent state, meaning that both parties
need to consent in order for a phone conversation to be legally recorded. Did
Google seek the permission of these businesses before calling them for the
purposes of the demo? Was it staged in the simulated manner of reality TV?"

0\. [https://www.axios.com/google-ai-demo-
questions-9a57afad-9854...](https://www.axios.com/google-ai-demo-
questions-9a57afad-9854-41da-b6e2-5e55b619283e.html)

~~~
sterlind
Two-party consent laws raise an interesting point: how could Duplex operate
legally in California?

Google probably wants recordings of each call, so they'd have to include a
preamble. Phone systems that say "Your call may be recorded for quality
assurance purposes" usually:

1\. Make you press a button before you speak to a human, thereby recording
your consent.

2\. Don't record before that point (since you're in the phone tree.)

Seems unlikely Duplex will pass as human, for legal reasons.

~~~
bb88
> Two-party consent laws raise an interesting point: how could Duplex operate
> legally in California?

They're not testing it in California, is one option. Two party consent is not
the standard in most states.

[http://www.dmlp.org/legal-guide/recording-phone-calls-and-
co...](http://www.dmlp.org/legal-guide/recording-phone-calls-and-
conversations)

~~~
gnicholas
Honest question: is one-party consent satisfied if the party that has "given
consent" is an AI bot? I could see how someone would argue that there was only
one party on the call capable of consent, and that he/she (the human caller)
did not give it.

Another hypo: What if my AI bot calls the restaurant and talks to their AI
bot? Is that a call with zero parties? Can consent for recording be given at
all?

My guess is that it depends on the definition of "party". Cue the lawyers.

~~~
amenod
<IANAL> I would assume that the bot is operating on behalf of someone and it
is their consent that matters. So if bot gives the consent in their name, it
would be the same as if they gave the consent themselves. </IANAL>

~~~
aurailious
So would that mean the person asking the bot is the one giving consent, or
would that mean Google is the one giving consent? Or is the person giving permission
to Google to give consent? Would Google give permission to the recording of
the call?

~~~
shaftway
I'm also not a lawyer, but....

I believe generally the US courts have decided that the person who triggers
the action is the pertinent party.

Something similar has come up in the firearm fabrication community. There are
companies that sell things that are legally paperweights, but are 80% of the
way towards being a firearm (e.g. [1]). There are CNC mills that come with
programming to take one of these "80% lowers" and finish it, making a fully
functional firearm, or at least the part that is considered a firearm by the
BATF; other parts can be ordered and shipped online (e.g. [2]).

In this case it's not the builder of the machine or the writer of the code
that instructs it that is the "manufacturer" of the gun. It's not even the
person who placed the unfinished lower in the enclosure and bolted it down.
Instead it's the person who pushed that button.

I'm going to guess that this will be similar. The user who pushes the button
(or asks assistant to make an appointment on their behalf) is likely to be the
one giving consent.

[1]: [https://www.80percentarms.com/](https://www.80percentarms.com/)

[2]: [https://ghostgunner.net/products/ghost-
gunner](https://ghostgunner.net/products/ghost-gunner)

~~~
gnicholas
I'm the person who originally posed this question, and I actually am a
(former) lawyer. I think these are all good analogies, and courts would think
about things like this.

But neither the laws nor the judges will be uniform. The laws use different
words to say slightly different things, and will have different legislative
histories (the record from the officials who voted them into law).

Some judges will look to the actual words to interpret the law, and others
will look to the legislative history. Still others might desire a "living
Constitution" approach — applying the laws in a way that they think makes
sense in today's world.

Considering the dozens of laws and thousands of judges that could opine, there
will likely be considerable uncertainty in this area for years to come.

------
freyir
Google refused to provide the businesses' names to Axios (even with a
guarantee not to publicly identify them), and refused to answer whether the
calls were edited.

If the calls _were_ real, I wonder how many recordings did it take to get
these perfect examples? How many times were the appointments scheduled
correctly, and how many failed?

But this was just a flashy demonstration to drum up excitement over the Google
brand. Until they write up a paper or release it as a product, we're unlikely
to know how well it really works.

~~~
gnicholas
Precisely. What happens if the system makes a mistake and tells you you're
confirmed when that's not the case? Then you show up for a reservation that
doesn't exist and you wish you'd spent 3 mins making the call yourself.

The question will be what is the net time savings? If it works 99 out of 100
times, maybe that's good enough for many mundane tasks. But if it's 75 out of
100, that's almost certainly not worth it.
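
A rough way to put numbers on that trade-off, as a sketch only: the 3-minute call comes from the comment above, while the 60-minute cost of showing up to a reservation that doesn't exist is purely an assumption for illustration, as are the function names.

```python
def net_minutes_saved(success_rate, call_minutes=3.0, failure_cost_minutes=60.0):
    """Expected minutes saved per booking by delegating the call.

    call_minutes: time you would have spent calling yourself (from the comment).
    failure_cost_minutes: ASSUMED time lost when the bot wrongly reports success.
    """
    return call_minutes - (1.0 - success_rate) * failure_cost_minutes


def break_even_success_rate(call_minutes=3.0, failure_cost_minutes=60.0):
    """Success rate at which delegating neither saves nor costs time."""
    return 1.0 - call_minutes / failure_cost_minutes


for rate in (0.99, 0.75):
    print(f"{rate:.0%} success: {net_minutes_saved(rate):+.1f} minutes per booking")
print(f"break-even success rate: {break_even_success_rate():.0%}")
```

Under those assumed costs, 99 out of 100 comes out ahead and 75 out of 100 comes out well behind; a different failure cost moves the break-even point.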

~~~
utopcell
I agree. These kinds of systems need 99% precision.

~~~
Proziam
I'm not even sure 99% would be enough for my tastes. Imagine an important date
(anniversary or similar) with your S.O. If there was even a 1% chance that
didn't happen because I was too lazy to call I'd certainly be hesitant to use
it. And if I can't use it habitually for the task of scheduling, I probably
will continue to do what I do now and just make the call.

~~~
snowwrestler
I think human perception of technology is often dominated by annoyance. From a
little bit of research I did years ago, 99% accuracy was nowhere near good
enough to satisfy customers of handwriting recognition--that's about one word
mistake per paragraph.

Consider the new Apple keyboards, which attract hundreds of intense complaints
every time they come up here on HN. Apple Insider [1] estimated the percentage
of service tickets that were attributed to keyboards:

2014 - 5.6%

2015 - 6%

2016 - 11.8% (first year of new keyboard)

2017 so far - 8.1%

So, an approximate doubling of the prevalence of keyboard problems in the
first year has been enough to convince a lot of people that the entire product
is an unmitigated disaster.

One of the very interesting facts in that Apple Insider study is that the
total number of tickets actually _decreased_ from 2015 to 2016. If the
keyboard was actually a design disaster, you would expect the total number of
warranty service tickets to climb, but they didn't.

2014 - 2,120

2015 - 1,904

2016 - 1,402

2017 - (incomplete)

[1] [https://appleinsider.com/articles/18/04/30/2016-macbook-
pro-...](https://appleinsider.com/articles/18/04/30/2016-macbook-pro-
butterfly-keyboards-failing-twice-as-frequently-as-older-models)
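
For anyone who wants absolute counts rather than shares, here is a quick sketch that multiplies the two lists above. It assumes each percentage is a share of the total quoted for the same year, which is how I read the study but is not stated explicitly; the variable and function names are mine.

```python
# Figures quoted above: keyboard share of tickets (%) and total tickets per year.
KEYBOARD_SHARE_PCT = {2014: 5.6, 2015: 6.0, 2016: 11.8}
TOTAL_TICKETS = {2014: 2120, 2015: 1904, 2016: 1402}


def estimated_keyboard_tickets(share_pct, totals):
    """Estimate keyboard tickets per year as share * total (an assumption)."""
    return {year: share_pct[year] / 100.0 * totals[year] for year in share_pct}


for year, count in sorted(estimated_keyboard_tickets(KEYBOARD_SHARE_PCT, TOTAL_TICKETS).items()):
    print(year, round(count))
```

2017 is left out because the total for that year is incomplete.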

------
Ivoirians
Googler here. I've seen a few other demos, and there was even an internal
dogfood, although I didn't use it myself. But I guess you could still choose
not to take my word for it, since I'm just a random guy on the internet. Just
wait til it launches, I guess.

Also, I distinctly remember on multiple occasions having to ask "is this <XX
restaurant>?" when making reservations in the past. Does this really never
happen to people that it feels more likely that Google is faking high-profile
demos in the keynote that it can't actually deliver on? And wouldn't they want
to pick demos that didn't include or redacted the name of the restaurant
anyway?

~~~
gamblor956
Just tried calling a bunch of restaurants in my neck of the woods. They _all_
identified the restaurant right off the bat, along with the person speaking.

Is it some weird Bay Area thing that businesses don't identify themselves when
they answer a call?

~~~
hunter23
I have noticed that some Chinatown restaurants don't tend to identify
themselves when I call in. I think part of it is a language / cultural
barrier. It's a lot different calling a chain where they have huge manuals on
how to interact with customers vs an immigrant family run business. Is that
not the case for your Chinese restaurants?

However, more likely I think they edited out the initial introduction to give
the businesses privacy. They also probably told the businesses beforehand that
there was a chance they would be calling in the future with bots to get around
California 2 party laws on recording (or maybe they got approval afterwards
although I'm not sure that's legal).

~~~
Piskvorrr
That's the point. "What you're about to hear is an actual call." ("...what
you're not about to hear is that we have heavily edited it"? Or perhaps
"lightly edited it?" Well nobody knows - and since Google ain't telling, it is
reasonable to assume the former).

------
lgleason
I'll say this again. Currently I have Google home hooked up to my home
automation system. It has trouble correctly identifying commands for things
like turning on my desk light and understanding South African accents when
friends visit from there. My Alexa does a lot better with that. Yet somehow I'm
to believe that they have made this massive step forward. Google used to under
promise and over-deliver. Now it has been taken over by marketers who tend to
over promise and under deliver.

The truth is that they still make the majority of their profit from search,
but need to look like they have something to back that up to keep the stock
price up. Between that and Waymo it really felt like they don't. Android and
the Chromebooks are promising as technologies, and I like the refinements they
are offering, but the monetization strategy doesn't seem to be clear or strong.
It is definitely not as strong as Apple's ability to make profit from selling
Macs and iPhones...

~~~
plicense
Sorry for the irrelevant comment - I've just been looking at your past
comments. Looks like your gripe with Google started 4-6 months ago and has been
continuing ever since. What happened?

~~~
jessaustin
The parent comment itself describes dissatisfaction while using Google
products. Perhaps this use, or at least the dissatisfaction it inspires, began
then? Regardless, this sort of questioning seems unfair.

~~~
plicense
There are certain things in the original comment that I agree with (in certain
situations I too have found Alexa to be better performing than Google Home),
certain things maybe not so much (like this statement "Google used to under
promise and over-deliver. Now it has been taken over by marketers who tend to
over promise and under deliver.")

One of the things I've decided to do these days when I find things that make
no sense when I read them, is to figure out if the commenter had any pre-
existing biases.

------
jpm_sd
"Any sufficiently advanced technology is indistinguishable from a rigged
demo." \-- attributed to Andy Finkel

~~~
blablabla123
So true. Especially when people hold presentations that don't do this on a
regular basis, it's often beyond recognition a) what is going on the screen and
b) what the point is.

~~~
zwkrt
A sign of a good manager/exec is their reaction to a demo. If they say "I
don't believe you, this looks magical", they might actually be thinking about
what it would take to bring your demo to production. If they say "wow that's
great, amazing work guys!" it is time to worry.

~~~
blablabla123
> wow that's great, amazing work guys!

I guess it's nice to give great feedback when the work was great. I don't see
why people sometimes want to obsessively find a problem that doesn't really
change things.

But generally I totally agree with you. When the manager only makes positive
comments and doesn't engage, the person is not being helpful at all.

------
jlebar
> Nor is there any ambient noise in Google’s recordings, as one would expect
> in a hair salon or a restaurant.

Second recording at [https://ai.googleblog.com/2018/05/duplex-ai-system-for-
natur...](https://ai.googleblog.com/2018/05/duplex-ai-system-for-natural-
conversation.html) has very clear background noise.

------
dictum
No _tu quoque_ intended, but I find the history behind the original iPhone
introduction fascinating: [https://www.nytimes.com/2013/10/06/magazine/and-
then-steve-s...](https://www.nytimes.com/2013/10/06/magazine/and-then-steve-
said-let-there-be-an-iphone.html) (spoiler: it required some fakery and even a
violation of FCC frequency rules)

~~~
lbenes
Great article, thanks for sharing.

spoiler: FCC part was clever, but what fakery? Are you talking about the
"golden path" or something else I missed after a quick read?

~~~
ddoolin
They faked the phone's signal bars to always show five bars, since it was
expected that the radio software would crash somewhere in the 90-minute
presentation.

~~~
mcphage
Was it a presentation about their novel new signal bars?

------
utopcell
Not only were these calls not faked, the dogfooders were required to go out
and honor their reservations.

Internally, the criticism is brutal, because we want free-form, general,
conversation abilities, and we don't consider the demo to be perfect.

But fake? IMO, if the team wanted to fake a demo, they could have done that
years ago when the project started.

But we are Google. We don't fake demos because we don't have to.

[edit: To add to what I said: If you think that a team at Google could fake
something at this scale and have the face of the company back it at our most
high-profile event of the year, just think of the aftermath internally. The
code is available for all of us to study. The design docs are there for us to
read. There were thousands of engineers that at least peeked at the codebase
even on the same day of the demo.]

~~~
robg
Then why not release or discuss the basic details? The locations need not be
identified publicly to confirm how it all went down.

~~~
utopcell
Not sure why not or if that even is the case. I am curious myself and I will
try to find out.

------
prkvs
Google claims that restaurant booking quoted in the blog[1] was done using
Duplex.

Someone should reverse lookup this image[2], match it with Bay Area-based
restaurants and find the restaurant.

[1] [https://ai.googleblog.com/2018/05/duplex-ai-system-for-
natur...](https://ai.googleblog.com/2018/05/duplex-ai-system-for-natural-
conversation.html) [2]
[https://3.bp.blogspot.com/-Arp_jhtS4F0/WvD7KzLvDRI/AAAAAAAAC...](https://3.bp.blogspot.com/-Arp_jhtS4F0/WvD7KzLvDRI/AAAAAAAACrs/CgWeSM1I3okK5uyCt0d9BsYhtKBnpKuRACLcBGAs/s1600/Yaniv%2BMatan%2BFinal.jpeg)

Edit: It looks like a Chinese/Korean restaurant from the image.

~~~
Piskvorrr
Your wish has apparently been granted.
[https://daringfireball.net/2018/05/duplex_booked_restaurant](https://daringfireball.net/2018/05/duplex_booked_restaurant)

~~~
dmix
The restaurant [https://m.yelp.com/biz/hongs-gourmet-
saratoga](https://m.yelp.com/biz/hongs-gourmet-saratoga)

------
awesomepantsm
This reminds me of the advice: Don't write articles that can be answered with
a single word "no".

Unlike startups, it's not like Google's business depends on investors
supporting lofty goals. There would seem to be no benefit to faking a demo
like this.

Not revealing the business location is likely just because they consider their
business partners to be business data that they don't want to give away to
competitors. Clean audio could easily just be done via a simple noise filter,
just for the sake of the demo. When you have a little noise in your ear, it's
not bad. But when you need to broadcast sound to an entire stadium, you need
to rebalance it. This story is a load of nonsense.

~~~
MBCook
How do we know the answer is no? Maybe Google’s software can do what they say
but they didn’t demo that.

They supposedly made a call (won’t release proof) to a restaurant (that they
won’t name) and talked to a receptionist (who didn’t mention the restaurant
name) that didn’t ask what time the reservation was for. And you couldn’t hear
the restaurant in the background.

The demo is really suspicious. They deserve to be called on it. For all we
know that was a recording of a fake training/test call with a Google employee.

If they manipulated the audio in some way (cut out an intro, filtered out
noise, etc) they just have to say so.

Google deserves this kind of scrutiny. They’re a MASSIVE company, they should
be able to handle these kinds of questions about new products.

Especially those that are supposed to be released in the next few months.

~~~
jessaustin
A smoother fake would have bleeped over the supposed restaurant name.
Something to look forward to next time!

------
muglug
> The snippets of conversation during Pichai’s demo, which can be heard in
> this clip, seem too polished and unrealistic to be real.

Reminder that Duplex explicitly uses Concatenative Text-to-Speech - i.e. they
record humans saying phrases, and just play those soundbites back where
appropriate. Sort of like a chess AI storing a dictionary of opening moves.

~~~
FelipeCortez
I believe most concatenative TTS systems nowadays are diphone/phone based,
rather than playing back full phrases. But being phrase-based could explain
how good the demos sound.

~~~
PeterisP
If you want to be able to say arbitrary text, then you pretty much have to go
lower than word level, because the number of different words is effectively
unbounded. However, if you're speaking a controlled language where there are
clear limits to what the system would ever want to say, then there's no
problem to record enough high quality words or whole phrases to concatenate
them into something nice sounding.
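
A toy sketch of that controlled-language idea, to make it concrete. This is not how Duplex works (nothing public confirms its internals); the phrase inventory, clip file names, and `plan_clips` function are all hypothetical, and strings stand in for actual audio.

```python
# Hypothetical inventory: each pre-recorded phrase maps to a clip name.
RECORDED_CLIPS = {
    "hi": "clip_hi.wav",
    "i'd like to book": "clip_book.wav",
    "a table": "clip_table.wav",
    "for four": "clip_four.wav",
    "on wednesday": "clip_wed.wav",
}


def plan_clips(utterance, clips=RECORDED_CLIPS):
    """Cover the utterance with recorded phrases, longest match first.

    Returns the clip names to play back in order. Raises KeyError when
    some part of the utterance has no recording, which is the limit of a
    controlled language: the system can only say what was recorded.
    """
    words = utterance.lower().split()
    plan = []
    i = 0
    while i < len(words):
        # Try the longest remaining span first, then shrink it.
        for j in range(len(words), i, -1):
            phrase = " ".join(words[i:j])
            if phrase in clips:
                plan.append(clips[phrase])
                i = j
                break
        else:
            raise KeyError(f"no recording covers: {' '.join(words[i:])!r}")
    return plan


print(plan_clips("Hi I'd like to book a table for four on Wednesday"))
```

Anything outside the inventory fails immediately, which is why arbitrary text needs sub-word units instead.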

------
jccalhoun
Remember, actually calling someone is the last resort. If google can make a
reservation for you online then it will do that first.

If they have to call, I suspect that it isn't 100% without human intervention.
I am reminded of "Samantha West" the telemarketer "bot" that will robocall
people and respond to people. It turns out it isn't a bot at all but someone
pushing buttons on a soundboard to play the appropriate prerecorded response.
[http://newsfeed.time.com/2013/12/17/robot-telemarketer-
saman...](http://newsfeed.time.com/2013/12/17/robot-telemarketer-samantha-
west/)

It would be pretty easy for the small number of times that google can't make a
reservation online for you to pay some people in a call center to do this.

------
kozikow
I think it's obvious to anyone who has ever done an AI/ML demo. Let him/her who never
picked the best example out of 20 for their demo cast the first stone.

~~~
telltruth
Not these days. Without code and reproducibility, most researchers would
ignore your amazing demo. You should check out recent feud between Yann LeCun
and Sophia.

------
gwbas1c
I find that it's easy to get a proof-of-concept together. The challenging
thing is getting all the details right for a working product. Or, to put it
mildly, people who don't understand easily confuse a demo with a working
product.

It's probably just a proof of concept that doesn't work very well. They
probably just played back one of the few attempts where it worked very well.

I even remember, when reading the article, a lot of caution in overestimating
the state of the technology. So, given their disclaimer, I'm giving them the
benefit of the doubt; and I'm thinking that everyone else overreacted because
they don't understand software development.

------
jonknee
Google's in a lose-lose here: if they relent and identify the businesses the
press will descend on them to get quotes about how they feel "getting tricked"
by a bot and so on. But if Google hadn't left out the identifying information
in the first place the businesses would have been inundated by callers and
Google would have been criticized for letting a small business get "doxed"
like that. And finally if they don't release any info there will be more
stories about how it's fake.

I guess they could have the bot call Dan Primack at Axios and clear it up...

------
RA_Fisher
Haha, the old Mechanical Turk trick!
[https://en.m.wikipedia.org/wiki/The_Turk](https://en.m.wikipedia.org/wiki/The_Turk)
With all the smarties on the Google Brain team, I'm surprised they need it.

------
tehsauce
I think the more important point, rather than whether or not it was "fake", is
that it was certainly cherry-picked. This is an issue with a lot of machine
learning and AI presentations and publications today. If you want to show how
useful something is, there needs to be a functional demo.

------
czbond
Who cares. Even if it is not a "real restaurant" \- as long as it is even
Duplex calling a Google employee pretending to be a restaurant, does it really
matter? [note: I did not watch the actual reveal, so as long as they didn't
make the exact claim]

~~~
scarface74
Yes it matters. One of my go-to "Hello World" type of programs that I used to
write when learning a new framework or language was an Eliza-like chatbot.
Whenever I chatted with it, it worked amazingly well even when I was trying to
test for corner cases.

The minute someone else used it, the illusion was shot. It's really easy to
make a chatbot work when you keep it on the rails - even when you're not
intending to.

------
otakucode
Voice synthesis is really fantastic at this point. But there is a critical
flaw in every single actual USE of it. They are obsessed with their 'cloud'
and don't do any of the processing locally. This guarantees the telltale
latency lag which can not be remedied. Conversation with any synthetic voice
will always be a beat off due to the latency. It's the same as the annoying
wait while Alexa or Siri sends your voice data off to a cloud to be processed
(which is utterly unnecessary for the recognition part). You will always know
you're talking to a robot because the brain isn't in the head you're talking
to and the speed of light is only so fast.

~~~
djrogers
> It's the same as the annoying wait while Alexa or Siri sends your voice data
> off to a cloud to be processed (which is utterly unnecessary for the
> recognition part).

True - in fact the iPhone actually does transcription on-device, AFAIK the
cloud loop is for validation and to get answers.

Don't believe me? Put your iPhone in airplane mode and use the dictation button
on your keyboard!

------
jayd16
This is like being suspicious about why the photos in an Apple keynote look so
good. Almost like they were hand picked to look good?

~~~
Kenji
This. I believe the conversations were genuine, but I also believe they were
two of hundreds or even thousands. Cherry picked. I am absolutely certain that
some calls ended in complete confusion on both sides.

------
angryasian
Just curious, but does it really matter for a tech demo? IMO even if the person
answering the phone was scripted but the AI portion was real, it's still
impressive to me.

~~~
elicash
Is playing an edited video or audio piece a "demo?"

They clearly portrayed it as a real, unedited call. If they were dishonest, of
course that matters. Arguably ethics don't matter as much as the underlying
tech, I suppose. But it still says something about the individuals involved.

~~~
ocdtrekkie
Probably more interesting regarding their ethics is that they've seemingly
declined to answer most questions about the demo. They've declined to answer
whether or not Duplex records calls, they've declined to confirm who they
called with it, they declined to confirm if they tested it in a state where
recording the calls is legal...

They seem _incredibly_ dodgy for something they announced openly in front of
thousands of people.

~~~
282883392
But do they really need to answer those questions? Clearly they don't want to
publicize the establishments that they called. They most likely record calls
(although they probably don't know if the final product will) and why would
they confirm that they have done something illegal?

It doesn't hurt them to not answer these questions, and they definitely don't
want to answer some of them, so why would they?

~~~
elicash
It could be that the audio is only edited in those first several seconds. That
a normal call goes "Hello this is [business name]." Then Google starts
recording as it replies that the call is being recorded. And then starts
ordering.

And that's likely why, weirdly, no business identifies itself in the audio.
They don't want to admit to ANY editing because it would open up MORE
questions.

------
fwdpropaganda
Prediction: Google will come out and say that BOTH the voices were google
assistants talking to each other.

~~~
paulcole
I thought this was going to be the big twist reveal at the end of the demo
videos.

------
lern_too_spel
Your employer can record your phone calls at work. Google's testing of this
service would have simply required Google to obtain permission from the
restaurants enabled in its tests.

~~~
iandanforth
[https://kellergrover.com/news/privacy-violations/ok-
record-p...](https://kellergrover.com/news/privacy-violations/ok-record-phone-
call-california/)

I think this would require that a grant of permission is transferable.

\- Employee grants permission to employer to record as term of employment.

\- Employer grants Google blanket permission to record calls.

Do you know if it works that way?

Also, any business that is recording its employees _must_ notify customers.
This would be another thing they would have had to edit out of the call.

~~~
lern_too_spel
Your article says it works that way, so Google only has to ask permission from
the employer. The employer is the "person" who must grant permission to record
the call.

> Also, any business that is recording its employees must notify customers.

Google did the recording, so that doesn't apply. Even if it did, it would make
sense for Google to edit out the "Your call may be recorded for quality
recording purposes," in the demo.

------
threatofrain
This technology in general also raises a really modern question of what does
verbal consent mean now that a program can sound like me?

------
lifeformed
Couldn't they just have done the calls in a different state, where such a
recording is allowed?

------
spocklivelong
I look forward to a day when two bots talk to each other and just make things
happen.

------
ferongr
No.

------
chrischen
They heavily faked their Magic Leap demos.

------
yonkshi
Google AI hadn't published any major papers on ASR or TTS since WaveNet; even
the new WaveNet Google demo'd a few months ago wasn't even close to this
demo's voices. The use of "Uh Huh" and "Hmmm" is called backchanneling, and it's
a moonshot away given the current state of technology. It requires very low
latency and precise timing so it doesn't cut the speaker off.

I believe if this demo were real, it was tested and cherry picked from a very
particular environment.

~~~
sixdimensional
This is a long shot but, do you think it is at all possible that they've had
some kind of a breakthrough using their quantum processing work, and are using
that to accelerate machine learning models and produce much better quality
generative speech? After hearing about Bristlecone I can't rule it out.

~~~
jessaustin
That would be a monumental case of burying the lede...

Surely if they had that computing power they would just be breaking crypto and
stealing money all day long?

