
China's AI news anchors - alanwong
https://www.inkstonenews.com/tech/xinhua-and-sogou-show-news-anchors-powered-artificial-intelligence/article/2172460
======
ilamont
The idea of a virtual anchor was pioneered >10 years ago by a group at the
Northwestern CS Department including Kris Hammond. The project was called
"News at Seven." This video shows the later iterations of it:
[https://www.youtube.com/watch?v=M3MRkFYM9Q4](https://www.youtube.com/watch?v=M3MRkFYM9Q4)

IIRC the early version around 2005 or 2006 used the Half Life 2 engine, so you
would have Alyx Vance reading the news with a gun strapped to her belt. At the
time I remember thinking that this would be great when the customization
options were enabled, as you could just get the news you want instead of the
stuff you don't care about that takes up most newscasts today, as well as
preferred visual and voice skins to handle the narration.

As someone who worked in broadcast news, I can say that most anchors do
nothing more than reading from a script so this type of task is perfectly
suited to replacement by a bot or AI or whatever you want to call it. Where it
fails is the "banter" format that many U.S. local newscasts use and the ad hoc
interviews that local and national anchors conduct with reporters in the
field.

~~~
ndnichols
That was me! I came here to comment some version of "Ahem, we were doing this
in 2005 at Northwestern" and seeing your comment to the same effect made me
super happy. Thanks for noticing and remembering!

~~~
ilamont
It was pretty cool! I always wondered why someone didn't try to take it
further, i.e. commercialization. Was the tech just too limited at that time?

------
ian0
Wow, text-to-speech doesn't seem to have gotten very far in the last 20 years?
When we were kids we used to use a Windows (3.1?) program for prank calls and
it didn't sound much worse. I would have thought by now basic things like
word spacing and tones would have improved to the point where you would only
realise it's artificial occasionally.

~~~
MaBu
It definitely doesn't use state-of-the-art TTS. It sounds similar to an MBrola
English voice. There is Google's
[Wavenet](https://deepmind.com/blog/wavenet-generative-model-raw-audio/) and
even Chinese Baidu's [Clarinet](https://clarinet-demo.github.io/), which are
much better.

~~~
ian0
Cheers - yep, Google's especially sounds much more realistic. Obviously still
a while to go before a user would be tricked into thinking it's a real voice.

------
scj
My first thought was towards Max Headroom. My second thought was on the Max
Headroom incident.

If it went live, how long before a copy lands in the hands of pranksters?

~~~
lawlessone
Thinking on that... someone could use it to declare a revolution, or say claim
a disaster has happened.

~~~
bitL
Once general-purpose GPUs get 100x faster (by 2025?), they'll get classified
as weapons and one would need a strictly regulated license to do Deep Learning
on them, with computations logged/snapshotted onto some governmental server.

Gamers will be forced to stream from cloud.

~~~
adventured
Can you name another common computer technology that has gotten so advanced it
became classified as a weapon?

The notion that in six years general purpose GPUs will be strictly regulated
as weapons, is pretty comical. It'll never happen.

In three or four years the improvement scaling will already be enough that
whatever scenario you think must wait six years to justify strictly regulating
GPU ownership, will already be possible. Absolutely nothing interesting is
going to happen in the next six years that will require practically outlawing
ownership of GPUs.

It's a non-issue for the next 10-15 years at a minimum. More likely it will
never be an issue, because of the expertise that will be required. You can do
nuclear technology research on your home computer, entirely legally. They do a
pretty good job of keeping track of people capable of building nuclear tech.
If AI becomes a similar risk, they'll do the same thing with the best minds in
AI (ie track them). Very few AI developers will be able to build particularly
dangerous AI applications, the counter thinking to that is fantasy based, a
form of technology fear (which for 50+ years has failed to prove out time and
time again; the technology fear ideology has failed so often and so
spectacularly that it should be entirely discredited at this point).

~~~
bitL
Jensen Huang mentioned that he projects 1,000x GPU speed increase by 2025, so
I am being conservative. If you also talk to NVidia people, the "rumors" are
that Turing/Ampere will be pure gaming + inference cards and compute cards
will no longer be sold individually. Moreover, the recent pricing hike makes me
think NVidia and game publishers are preparing the ground for streaming
services, first as an optional choice, then, due to steeply growing individual
GPU prices and lower TCO of subscriptions, as the only economical choice for a
regular gamer. A 1,000x speedup would definitely allow some advanced things like Deep
Fakes in near realtime; that could be extremely damaging to governments (we
all could think of what fake but realistic coordinated nuke videos would
cause, or a made up "leak" of world leaders with realistic audiovisual
content), so a logical conclusion on my part is a strict regulation of such
technology, that could indeed be classified as a psychological weapon of mass
effectiveness.

------
SrslyJosh
> "AI news anchors"

They read from a script, poorly. The video of the anchor is generated using
ML, but there is no "intelligence" here, not even artificial.

It's good propaganda for China, but not much more right now.

~~~
pvarangot
> They read from a script

"Real" or "human" news anchors don't? Asking because I really don't know, not
to be pedantically poignant.

~~~
tomcooks
I think the user was referring to the AI bit. This "AI news anchor" is a
glorified TTS.

------
achow
Magic Leap's MICA was more impressive

[https://youtu.be/QBd-egUFV_4?t=34](https://youtu.be/QBd-egUFV_4?t=34)

~~~
alexnewman
Does mica talk? That's the hard part right?

------
iambateman
Watching the video, it’s pretty clear that this is an interesting stunt but
would be considered unwatchable for any length of time.

I can imagine this working well in a couple years, but today it’s a long way
off.

------
nobrains
Ananova (circa April 2000)
[https://en.wikipedia.org/wiki/Ananova](https://en.wikipedia.org/wiki/Ananova)

------
sbhn
England created Max Headroom before America’s MTV took it over

~~~
eskaytwo
And then he took over
[https://en.wikipedia.org/wiki/Max_Headroom_broadcast_signal_...](https://en.wikipedia.org/wiki/Max_Headroom_broadcast_signal_intrusion)

------
ChuckMcM
I think it is fun stuff, and it was one of the plot points in Heinlein's "The
Moon is a Harsh Mistress", but the execution is still a bit off.

It gets really wild when you can generatively create a movie just from the
screenplay description and perhaps a set of storyboards. That will make the
amount of crap video available explode uncontrollably.

------
mindfulplay
I can't wait for the exciting block chain anchors and the cloud anchors.

Oh how about the new buzzword anchor??

~~~
AWildC182
Two words: Drone anchor.

~~~
dredmorbius
That'll be a fun etymology to unwind a century from now.

------
otoburb
>> _The English-speaking anchor, complete with a suit and tie, is modeled on a
real-life Xinhua anchor called Zhang Zhao._

I hope that Zhang Zhao is getting a licensing stream for the use of his
likeness. William Gibson was really ahead of his time when he popularized
"synthetic personality constructs" as a mainstream literary trope[1].

[1] [https://en.wikipedia.org/wiki/Idoru](https://en.wikipedia.org/wiki/Idoru)

~~~
stuxnet79
Now looking back, I'd say the entire Bridge Trilogy was remarkably prescient.
Especially so when you consider how SF / society has been transformed over the
past couple of years. Worth a re-read.

------
kitd
I happen to think that how people react to news is determined to quite a large
extent by the anchor's delivery. Stress and intonation give clues about
interpretation and expected response. Eg BBC newsreaders in the 1970s
delivered in a very flat and non-opinionated style. Today it's full of their
own "personality", ie opinion.

Worrying therefore that such stuff may be just a few config variables away
from being centrally controlled.

~~~
nemo44x
Let's take it a step further. Your news feed will know who is watching it, will
have loads of data on your interests, beliefs and opinions and will cater
the same news event specifically optimized for your consumption. Instead of
having a handful of news organizations which have obvious biases one way or
another to attract that particular audience, we'll have a unique experience
for every individual.

I'm actually a bit surprised that print news hasn't gotten to this step yet.
Just the injection, removal, or substitution of certain words can completely
change the meta-message of the news article.

Add a central authority and reprogramming people is as easy, as you've
mentioned, as a few config variables.

~~~
KineticLensman
And another step? - one suggested in 1968 in John Brunner's 'Stand on
Zanzibar' [0] which has the viewers inserted into the very programmes they are
themselves watching. Truly an awesome book, and still amazingly fresh today.

[0]
[https://en.wikipedia.org/wiki/Stand_on_Zanzibar](https://en.wikipedia.org/wiki/Stand_on_Zanzibar)

------
wufufufu
Top 10 technologies that sound scary when in an article starting with 'China'.

------
bananatron
China's track record with free speech makes the implications of completely
bypassing any human 'journalist' kind of scary. I guess it's always been
possible, but damn.

~~~
PakG1
I'm not confident that's an issue. The press usually takes the party line
anyway. In any case where the press feels compelled to not take the party line
(it's happened now and then), they've spoken out. I think if the press felt it
necessary, they could do a "special" presentation when they feel necessary.

Honestly, I don't see a lot of difference between them and the way Sean
Hannity blatantly takes the Trump line. The difference is that in the US,
there's diversity of press and so diversity of opinions. I'm not confident
there was ever diversity of press in China in the first place, so I'm not sure
how much this changes. Maybe if the entire press cycle gets completely
automated with no human oversight at all, OK. But I imagine there will always
be human oversight, just because they don't want things to go off the rails.
That's what happens right now.

And if we reach the singularity, I'm sure an AI would have more guts to stand
up and say what's on their mind than a human would.

~~~
bananatron
It would be very hard for anyone to watch most of Fox News and not see it as
some form of state-controlled media in this administration.

I guess my brain immediately goes to the situation you describe with little/no
human oversight. If the only layer of human oversight that does exist values
control over truth, this is just one more layer of human intervention removed.

~~~
drak0n1c
Lacking evidence for state-control, the most you could deductively claim is
that there is ideological alignment and existing friendships with openly
partisan pundits. A more appropriate label would be cronyism.

As an exercise in objective labeling - given Glenn Greenwald's reporting on
the communications and friendships between Clinton staff and press outlets [1]
- would all those news outlets and non-pundit journalists immediately be
considered state-controlled media had she been elected? I do not think so
either.

[1] [https://theintercept.com/2016/10/09/exclusive-new-email-leak...](https://theintercept.com/2016/10/09/exclusive-new-email-leak-reveals-clinton-campaigns-cozy-press-relationship/)

------
ablation
I watched the embedded clip and the anchor pronounced "Jack Ma" as "Jack
Massachusetts".

~~~
bradgnar
time please, sounds lolworthy

~~~
hiharryhere
Around 0:45 on the tweeted video.

------
m3kw9
I predict people will be less likely to watch once they know the human isn’t
real. Subconsciously some people want to see how the news anchor reacts; now
this guy only has a single set of emotions and cannot give you the occasional
off comment.

------
amelius
Someone still has to type in the text. So they might as well just read it out
loud. I don't see the point.

~~~
doodhwala
Other systems can be put in place, that generate the video and the text.

The main takeaway is the realism of the news anchor through their
implementation of the speech and lip movement synchronization - and how it may
be a stepping stone towards a world where news delivery can be automated.

------
doodhwala
_Alibaba chairman Jack Massachusetts_

One would expect them to go through their demo clip before sharing it with the
world!

~~~
Rebelgecko
I've noticed a few different TTS programs do something similar. It leads to a
weird uncanny valley when acronyms get substituted inappropriately. e.g. My
address includes the Spanish word "del". One automated phone system from an
insurance company read it back as "Delaware"
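
A minimal sketch of how this kind of mistake can arise, assuming (this is a
guess, not something Sogou, Xinhua, or the insurance company confirmed) that
the TTS front end expands state abbreviations with a case-insensitive lookup
and no context check. The table and both function names are hypothetical.

```python
import re

# Hypothetical abbreviation table; a real TTS front end would use far
# larger, context-aware rules.
STATE_ABBREVIATIONS = {
    "MA": "Massachusetts",
    "DE": "Delaware",
    "DEL": "Delaware",
}


def naive_expand(text: str) -> str:
    """Expand every word that matches a known abbreviation, ignoring case."""
    def replace(match: re.Match) -> str:
        token = match.group(0)
        return STATE_ABBREVIATIONS.get(token.upper(), token)

    return re.sub(r"[A-Za-z]+", replace, text)


def cautious_expand(text: str) -> str:
    """Expand only all-caps words, as in a postal address like 'Boston, MA'."""
    def replace(match: re.Match) -> str:
        token = match.group(0)
        if token.isupper():
            return STATE_ABBREVIATIONS.get(token, token)
        return token

    return re.sub(r"[A-Za-z]+", replace, text)


print(naive_expand("Alibaba chairman Jack Ma"))
print(cautious_expand("Alibaba chairman Jack Ma"))
```

The naive version turns "Jack Ma" into "Jack Massachusetts" and "del" into
"Delaware"; requiring all-caps input is only one of several possible guards
and would still misread a surname written in capitals.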

------
coldcode
Personally I couldn't care less about listening to someone read the news. I can
read much, much faster. Talking heads seem so 20th century.

------
rhema
Sounds like an espeak voice synthesizer.

------
dghughes
Throwing the term AI around is like the word Cyber was in the 1990s.

------
sebazzz
It would be more realistic if his/its jaw would move.

------
bob_paulson
The rise of the machines as described by Thomas Rid.

------
mooneater
Sounds well behind the current best text-to-speech.

------
gophicer
How is this AI?

~~~
Tarq0n
The image and voice synthesis is powered by deep learning. The news anchor
itself doesn't possess any intelligence.

------
HillaryBriss
i wonder what would happen if such an AI newscaster interviewed the president

------
ohiovr
Puppet anchors

------
cauldron
Reminder that one chief editor of the People's Daily committed suicide days
ago, following several similar suicides.

A stressful job even to read propaganda after all.

~~~
eiaoa
> Reminder that one chief editor of the People's Daily committed suicide days
> ago, following several similar suicides.

> A stressful job even to read propaganda after all.

Yeah, I've been told these news anchors are expected to not make a single
mistake in their speech or presentation during the broadcasts, and that
they're timed so precisely that you could set your watch by them.

~~~
all2
> and that they're timed so precisely that you could set your watch by them.

While your local news may not adhere to an absolutely precise schedule, most
broadcast mediums are scheduled to the second. When a program starts, the
anchor has exactly 15 seconds to fill. Roll a minute of ads. Then they have
3:45 to fill, etc.

Watch CNN or listen to a nationally syndicated radio show, and time it. You'll
see that the entire thing happens with sub-second precision.
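
The arithmetic behind that kind of rundown can be sketched as follows. Only
the three durations come from the example above; the segment labels and
function names are made up for illustration.

```python
# Durations in seconds, taken from the example above; labels are illustrative.
RUNDOWN = [
    ("anchor open", 15),
    ("ads", 60),
    ("anchor block", 3 * 60 + 45),
]


def fmt(seconds: int) -> str:
    """Format a second count as M:SS."""
    return f"{seconds // 60}:{seconds % 60:02d}"


def schedule(rundown):
    """Return (label, start, end) tuples with cumulative offsets in seconds."""
    slots = []
    clock = 0
    for label, duration in rundown:
        slots.append((label, clock, clock + duration))
        clock += duration
    return slots


for label, start, end in schedule(RUNDOWN):
    print(f"{fmt(start)} - {fmt(end)}  {label}")
```

With these numbers the three segments end exactly on the five-minute mark,
which is why an overrun of even a second in one segment has to be taken out
of another.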

~~~
reaperducer
Yep. This is the reason when I got out of broadcasting that my ulcers
magically went away.

------
cauldron
> Reminder that one chief editor of the People's Daily committed suicide days
> ago, following several similar suicides.

> A stressful job even to read propaganda after all I guess.

They even created a thing called "People's Search Engine" though it flopped.

------
nhauz
That's actually pretty cool. Reminds me of those youtube videos full of
automatically generated videos with TTS. This would be the next step :)

Also, anime girls.

------
browsercoin
Not at all challenging, considering the average CCTV (what a fitting name for
a chinese tv station) anchor's range of emotional expression is limited to a
knob.

Might as well just produce a deepfake anchor with Xi's ugly ass face instantly
traumatizing millions of Chinese children, who knows how many falun gong
infants were sacrificed to save the poor sick Chinese elite (they actually eat
the fucking placenta for supposedly viagra effect jesus fucking christ im
outta here never EVER gonna catch me going to China)

~~~
i_am_nomad
Please try to not be so polemic and disagreeable in your posts, and instead
bring a spirit of respect here. Chinese people are human beings.

~~~
msla
> Please try to not be so polemic and disagreeable in your posts, and instead
> bring a spirit of respect here.

Pointing out the crimes of the PRC is not disrespectful.

Except, perhaps, to the PRC.

> Chinese people are human beings.

Nobody's saying they're not.

~~~
browsercoin
> Except, perhaps, to the PRC.

or people who have been unwittingly duped/brainwashed into buying that "China
will overtake USA anytime now" propaganda.

it's not limited to race, I've definitely seen some white American dude
shilling for China...smh

~~~
PavlovsCat
> it's not limited to race

Very true.

[https://www.bbc.com/news/uk-england-27307476](https://www.bbc.com/news/uk-england-27307476)

> The Royal College of Midwives (RCM) said there was not enough evidence for
> the organisation to "either support or not support" placentophagy as there
> had not been enough research on the health benefits.

On the one hand I didn't want to post this because it kinda grossed me out and
might ruin someone's day or something -- on the other hand, I "totally could
see it being a thing in Asia", just because of the other powdered animal
stuff; but wasn't aware of it being a thing, period, and I guess that would
describe many people. And some of the sentences in that article are really
quite something, oh my.

~~~
browsercoin
I literally just threw up after finishing my avocado sandwich.

should add NSFL to that link.

im done with HN for the rest of the day.

------
NotAmazin
I like the idea, this is very economically sane since it will allow the news
company to deliver news all day without errors from human anchors. Although
people would still have to write the script, I know the state will always put
their best people to write 24/7.

~~~
alanwong
Do you prefer watching an error-free (in that it doesn’t stumble) news segment
delivered by a computer-visualized robot or an imperfect one by a human?

~~~
tehaugmenter
I mean personally, News Bloopers are the best kind of Bloopers. We laugh at
our own faults. It's more human this way.

