China's AI news anchors (inkstonenews.com)
93 points by alanwong 9 days ago | 108 comments





The idea of a virtual anchor was pioneered >10 years ago by a group at the Northwestern CS Department including Kris Hammond. The project was called "News at Seven." This video shows the later iterations of it: https://www.youtube.com/watch?v=M3MRkFYM9Q4

IIRC the early version around 2005 or 2006 used the Half Life 2 engine, so you would have Alyx Vance reading the news with a gun strapped to her belt. At the time I remember thinking that this would be great when the customization options were enabled, as you could just get the news you want instead of the stuff you don't care about that takes up most newscasts today, as well as preferred visual and voice skins to handle the narration.

As someone who worked in broadcast news, I can say that most anchors do nothing more than read from a script, so this type of task is perfectly suited to replacement by a bot or AI or whatever you want to call it. Where it fails is the "banter" format that many U.S. local newscasts use and the ad hoc interviews that local and national anchors conduct with reporters in the field.


That was me! I came here to comment some version of "Ahem, we were doing this in 2005 at Northwestern" and seeing your comment to the same effect made me super happy. Thanks for noticing and remembering!

It was pretty cool! I always wondered why someone didn't try to take it further, i.e. commercialization. Was the tech just too limited at that time?

> The idea of a virtual anchor was pioneered >10 years ago

I can cite an earlier example, perhaps?

Back around '98 I was part of a 3d game project that had a dynamic news reader that moved his lips in accordance with a wav file being played. So, although 'not AI', it did do a realtime FFT on the waveform and 'guesstimate' the shape of the mouth based on significant frequencies. Analysis of volume in the wav would trigger other movements (rocking backward), or increasing pitch might trigger eyebrow movements, etc. I recall that the 'wav' itself was stitched together from multiple wav snippets at runtime to give an accurate account of the game play that had just happened.

At the end of it all, it definitely satisfied the 'virtual anchor' as described here, and was all achieved with simple heuristics.

That said, I think I would've enjoyed being part of the virtual anchor team in the video.
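
For anyone curious what that kind of frequency-based lip-sync heuristic might look like, here is a minimal sketch. It is not the original game's code: the band boundaries, the mouth-shape names, the silence threshold and the helper names (`dft_magnitudes`, `guess_mouth_shape`, `sine`) are all assumptions made up for illustration, and it uses a naive DFT from the standard library where a real implementation would use an FFT.

```python
import cmath
import math

# Hypothetical frequency bands (Hz) mapped to mouth shapes; purely illustrative.
BANDS = [
    ("open", 0.0, 1000.0),        # mostly low-frequency energy, e.g. an "ah" vowel
    ("rounded", 1000.0, 2500.0),  # mid-frequency energy
    ("wide", 2500.0, 8000.0),     # high-frequency energy, e.g. "ee" or sibilants
]
SILENCE_RMS = 0.01  # assumed loudness threshold below which the mouth stays closed


def dft_magnitudes(frame):
    """Naive DFT magnitudes for bins 0..n/2 (an FFT yields the same values, faster)."""
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(frame))
        mags.append(abs(acc))
    return mags


def guess_mouth_shape(frame, sample_rate):
    """Pick the mouth shape whose frequency band holds the most energy in this frame."""
    rms = math.sqrt(sum(x * x for x in frame) / len(frame))
    if rms < SILENCE_RMS:
        return "closed"
    bin_hz = sample_rate / len(frame)
    energy = {name: 0.0 for name, _, _ in BANDS}
    for k, mag in enumerate(dft_magnitudes(frame)):
        freq = k * bin_hz
        for name, lo, hi in BANDS:
            if lo <= freq < hi:
                energy[name] += mag * mag
                break
    return max(energy, key=energy.get)


def sine(freq, sample_rate=16000, n=256, amp=0.5):
    """Test tone standing in for one short frame of the wav file."""
    return [amp * math.sin(2 * math.pi * freq * i / sample_rate) for i in range(n)]
```

Calling `guess_mouth_shape(sine(300), 16000)` once per audio frame would drive the mouth; the same RMS value could be reused for the volume-triggered movements mentioned above.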


> Where it fails is the "banter"

Actually, where it fails is breaking news.

Former newsman here, too. And I remember staying on the air for 12, 24, 36 hours or more with no commercials during breaking news (hurricanes, flooding, etc...), and it's not something any AI will be up to in our lifetimes.


That's covered by the ad hoc interviews, which may be for breaking news or the live shot from a pending hurricane zone or something more mundane like a reporter outside the courtroom recapping a verdict from earlier in the day.

What could happen in these cases is there is some sort of "live desk" staffed by someone from the reporting staff when the need arises. Or, rethink the way breaking news is covered altogether ... does it really need a handoff back to an anchor?


> does it really need a handoff back to an anchor?

Yes. While the anchor reads on-air, there are dozens of people behind the scenes doing other things. Some journalistic, some technical. All necessary and frenzied.

This isn't really the right medium to describe it all, though. During a serious news event, it can be quite the madhouse. I've never seen a Hollywood film that really captured it.


Wow text-to-speech doesn't seem to have gotten very far in the last 20 years? When we were kids we used to use a Windows (3.1?) program for prank calls and it didn't sound much worse. I would have thought by now basic things like word spacing and tones would have improved to the point where you would only realise it's artificial occasionally.

It definitely doesn't use state of the art TTS. It sounds similar to the MBROLA English voice. There is Google's [Wavenet](https://deepmind.com/blog/wavenet-generative-model-raw-audio...) and even Chinese Baidu's [Clarinet](https://clarinet-demo.github.io/), which are much better.

Cheers - yep Google's especially sounds much more realistic. Obviously still a while to go before a user would be tricked into thinking it's a real voice.

> Wow text-to-speech doesn't seem to have gotten very far in the last 20 years

FWIW, I find that the macOS' Tom voice is remarkably good. I sometimes wonder if it's what NPR uses for some of its on-air messages, or if Apple's voice was just based on the same guy.


Not sure if they are using state-of-the-art TTS. This one [1] sounds a bit more natural.

[1] https://neospeech.com/



Try to enter a paragraph of text you didn't see before and close your eyes while it's "speaking". Do not read along.

It does not have a lot of obvious artifacts that plagued older TTSes, but it actually is remarkably hard to understand.


Kind of uncanny valley. The pronunciation of the words is good but there's no emphasis through the sentence so it sounds flat and kind of unsettling.

They should make it a university teacher. Clearly it's got the core skill down.

My first thought was towards Max Headroom. My second thought was on the Max Headroom incident.

If it went live, how long before a copy lands in the hands of pranksters?


Thinking on that... someone could use it to declare a revolution, or say claim a disaster has happened.

Once general-purpose GPUs get 100x faster (by 2025?), they'll get classified as weapons and one would need a strictly regulated license to do Deep Learning on them, with computations logged/snapshotted onto some governmental server.

Gamers will be forced to stream from cloud.


Can you name another common computer technology that has gotten so advanced it became classified as a weapon?

The notion that in six years general purpose GPUs will be strictly regulated as weapons, is pretty comical. It'll never happen.

In three or four years the improvement scaling will already be enough that whatever scenario you think must wait six years to justify strictly regulating GPU ownership, will already be possible. Absolutely nothing interesting is going to happen in the next six years that will require practically outlawing ownership of GPUs.

It's a non-issue for the next 10-15 years at a minimum. More likely it will never be an issue, because of the expertise that will be required. You can do nuclear technology research on your home computer, entirely legally. They do a pretty good job of keeping track of people capable of building nuclear tech. If AI becomes a similar risk, they'll do the same thing with the best minds in AI (ie track them). Very few AI developers will be able to build particularly dangerous AI applications, the counter thinking to that is fantasy based, a form of technology fear (which for 50+ years has failed to prove out time and time again; the technology fear ideology has failed so often and so spectacularly that it should be entirely discredited at this point).


Jensen Huang mentioned that he projects a 1,000x GPU speed increase by 2025, so I am being conservative. If you also talk to NVidia people, the "rumors" are that Turing/Ampere will be pure gaming + inference cards and compute cards will no longer be sold individually. Moreover, the recent pricing hike makes me think NVidia and game publishers are preparing the ground for streaming services, first an optional choice, then due to steeply growing individual GPU prices and lower TCO of subscriptions the only economical choice for a regular gamer. A 1,000x speedup would definitely allow some advanced things like Deep Fakes in near realtime; that could be extremely damaging to governments (we all could think of what fake but realistic coordinated nuke videos would cause, or a made up "leak" of world leaders with realistic audiovisual content), so a logical conclusion on my part is a strict regulation of such technology, that could indeed be classified as a psychological weapon of mass effectiveness.

> Can you name another common computer technology that has gotten so advanced it became classified as a weapon?

Cryptography export laws show there's already precedent for this.


"with computations logged/snapshotted onto some governmental server."

If you have that one, then no regulated license will be necessary.


Or hasn't happened...

How's that different from AI-generated fake news videos, à la Jordan Peele invoking Obama?

it's not

> "AI news anchors"

They read from a script, poorly. The video of the anchor is generated using ML, but there is no "intelligence" here, not even artificial.

It's good propaganda for China, but not much more right now.


> They read from a script

"Real" or "human" news anchors don't? Asking because I really don't know, not to be pedantically poignant.


I think the user was referring to the AI bit. This "AI news anchor" is a glorified TTS.

Doesn't sound dissimilar from a news anchor made of meat.

Magic Leap's MICA was more impressive

https://youtu.be/QBd-egUFV_4?t=34


Does mica talk? That's the hard part right?

Watching the video, it’s pretty clear that this is an interesting stunt but would be considered unwatchable for any length of time.

I can imagine this working well in a couple years, but today it’s a long way off.


England created Max Headroom before America’s MTV took it over


Though Max Headroom is virtual in the “virtual YouTuber” or “virtual band” sense. ;)

I think it is fun stuff, and it was one of the plot points in Heinlein's "The Moon is a Harsh Mistress", but the execution is still a bit off.

It gets really wild when you can generatively create a movie just from the screenplay description and perhaps a set of storyboards. That will make the amount of crap video available explode uncontrollably.


Ananova (circa. April 2000) https://en.wikipedia.org/wiki/Ananova

Reminder that one chief editor of the People's Daily committed suicide days ago, following several similar suicides.

A stressful job even to read propaganda after all.


> Reminder that one chief editor of the People's Daily committed suicide days ago, following several similar suicides.

> A stressful job even to read propaganda after all.

Yeah, I've been told these news anchors are expected to not make a single mistake in their speech or presentation during the broadcasts, and that they're timed so precisely that you could set your watch by them.


> and that they're timed so precisely that you could set your watch by them.

While your local news may not adhere to an absolutely precise schedule, most broadcast mediums are scheduled to the second. When a program starts, the anchor has exactly 15 seconds to fill. Roll a minute of ads. Then they have 3:45 to fill, etc.

Watch CNN or listen to a nationally syndicated radio show, and time it. You'll see that the entire thing happens with sub-second precision.


Yep. This is the reason when I got out of broadcasting that my ulcers magically went away.

Another dangerous job we can protect humans from!

I can't wait for the exciting block chain anchors and the cloud anchors.

Oh how about the new buzzword anchor??



Two words: Drone anchor.

That'll be a fun etymology to unwind a century from now.

>>The English-speaking anchor, complete with a suit and tie, is modeled on a real-life Xinhua anchor called Zhang Zhao.

I hope that Zhang Zhao is getting a licensing stream for the use of his likeness. William Gibson was really ahead of his time when he popularized "synthetic personality constructs" as a mainstream literary trope[1].

[1] https://en.wikipedia.org/wiki/Idoru


Now looking back, I'd say the entire Bridge Trilogy was remarkably prescient. Especially so when you consider how SF / society has been transformed over the past couple of years. Worth a re-read.

I happen to think that how people react to news is determined to quite a large extent by the anchor's delivery. Stress and intonation give clues about interpretation and expected response. E.g. BBC newsreaders in the 1970s delivered in a very flat and non-opinionated style. Today it's full of their own "personality", i.e. opinion.

Worrying therefore that such stuff may be just a few config variables away from being centrally controlled.


Let's take it a step further. Your news feed will know who is watching it, will have loads of data on your interests, beliefs and opinions and will cater the same news event specifically optimized for your consumption. Instead of having a handful of news organizations which have obvious biases one way or another to attract that particular audience, we'll have a unique experience for every individual.

I'm actually a bit surprised that print news hasn't gotten to this step yet. Just the injection, removal, or substitution of certain words can completely change the meta-message of the news article.

Add a central authority and reprogramming people is as easy, as you've mentioned, as a few config variables.


And another step? - one suggested in 1968 in John Brunner's 'Stand on Zanzibar' [0] which has the viewers inserted into the very programmes they are themselves watching. Truly an awesome book, and still amazingly fresh today.

[0] https://en.wikipedia.org/wiki/Stand_on_Zanzibar


As a matter of fact, it will be your own TV that makes the "anchor", no need to broadcast the whole thing.

Listening to the 9/11 broadcast, the first thing that tipped me off was the anchor's tone of voice. Discovery accident likewise.

Top 10 technologies that sound scary when in an article starting with 'China'.

China's track record with free speech makes the implications of completely bypassing any human 'journalist' kind of scary. I guess it's always been possible, but damn.

If it gets cheap enough, it will likely happen elsewhere. Onscreen talent is an expense that a lot of media companies would probably happily get rid of. It doesn't mean the reporters go away, but a universally and perpetually attractive, easily changeable, contract-free and salary-free anchor would be a bonus for any news company.

It could even spark new industry as companies arise to create avatars for other public-facing entities where a real person is not required, like corporate PR spokespersons.

Hollywood has already started down this road by creating virtual stand-ins for deceased actors - there's no reason to consider that they won't just create one from the whole cloth within a few years.

It might not happen, but don't be surprised if it does.


anchor != journalist

True - in this case though, it's one less human which could say 'no, I don't want to say that.'

I'm not confident that's an issue. The press usually takes the party line anyway. In any case where the press feels compelled to not take the party line (it's happened now and then), they've spoken out. I think if the press felt it necessary, they could do a "special" presentation when they feel necessary.

Honestly, I don't see a lot of difference between them and the way Sean Hannity blatantly takes the Trump line. The difference is that in the US, there's diversity of press and so diversity of opinions. I'm not confident there was ever diversity of press in China in the first place, so I'm not sure how much this changes. Maybe if the entire press cycle gets completely automated with no human oversight at all, OK. But I imagine there will always be human oversight, just because they don't want things to go off the rails. That's what happens right now.

And if we reach the singularity, I'm sure an AI would have more guts to stand up and say what's on their mind than a human would.


It would be very hard for anyone to watch most of Fox News and not see it as some form of state-controlled media in this administration.

I guess my brain immediately goes to the situation you describe with little/no human oversight. If the only layer of human oversight that does exist values control over truth, this is just one more layer of human intervention removed.


Lacking evidence for state-control, the most you could deductively claim is that there is ideological alignment and existing friendships with openly partisan pundits. A more appropriate label would be cronyism.

As an exercise in objective labeling - given Glenn Greenwald's reporting on the communications and friendships between Clinton staff and press outlets [1] - would all those news outlets and non-pundit journalists immediately be considered state-controlled media had she been elected? I do not think so either.

[1] https://theintercept.com/2016/10/09/exclusive-new-email-leak...


> It would be very hard for anyone to watch most of Fox News and not see it as some form of state-controlled media in this administration.

I'm glad Americans have other, independent news sources. The problem with state-controlled media is not so much that it exists, but that when it does other options are often suppressed or eliminated in its favor.


Really, it would be hard, when ~50% of the country probably prefers fox and thinks CNN is the biased one? Seems awfully judgmental.

The only thing that would be on the mind of an AI is what is trained to be on its mind...

Isn't the main point of the singularity that AI would be able to break free of such bonds due to its superior intelligence?

> Isn't the main point of the singularity that AI would be able to break free of such bonds due to its superior intelligence?

At this point, the singularity is speculative fiction.


The original point was also only hypothetical.

I watched the embedded clip and the anchor pronounced "Jack Ma" as "Jack Massachusetts".

Reminds me of my Google Maps text to speech. I live near an Air Force base and there is ARB in the exit name, which it used to pronounce as "ARB" instead of "A.R.B". They have since changed it to spell out the acronyms.

time please, sounds lolworthy

Around 0:45 on the tweeted video.

I predict people will be less likely to watch once they know the human isn’t real. Subconsciously some people want to see how the news anchor reacts; this guy only has a single set of emotions and cannot give you the occasional off comment.

Alibaba chairman Jack Massachusetts

One would expect them to go through their demo clip before sharing it with the world!


I've noticed a few different TTS programs do something similar. It leads to a weird uncanny valley when acronyms get substituted inappropriately. e.g. My address includes the Spanish word "del". One automated phone system from an insurance company read it back as "Delaware"
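
One plausible cause of both "Jack Massachusetts" and "Delaware", assuming the TTS front end expands abbreviations before synthesis, is a lookup that ignores case and context. The sketch below is only a guess at that failure mode, not how Xinhua's or the insurance company's system actually works; the table and the function names `naive_expand` and `careful_expand` are hypothetical.

```python
import re

# Hypothetical abbreviation table; a real TTS front end would use a far larger one.
STATE_ABBREVIATIONS = {"MA": "Massachusetts", "DEL": "Delaware"}


def naive_expand(text):
    """Case-insensitive lookup on every word: reproduces the failure mode."""
    def repl(match):
        word = match.group(0)
        return STATE_ABBREVIATIONS.get(word.upper(), word)
    return re.sub(r"[A-Za-z]+", repl, text)


def careful_expand(text):
    """Expand only an all-caps token that directly follows a comma, as in 'Boston, MA'."""
    def repl(match):
        word = match.group(1)
        return ", " + STATE_ABBREVIATIONS.get(word, word)
    return re.sub(r",\s*([A-Z]{2,3})\b", repl, text)


print(naive_expand("Alibaba chairman Jack Ma"))    # Alibaba chairman Jack Massachusetts
print(naive_expand("Camino del Rio"))              # Camino Delaware Rio
print(careful_expand("Alibaba chairman Jack Ma"))  # Alibaba chairman Jack Ma
print(careful_expand("Boston, MA 02101"))          # Boston, Massachusetts 02101
```

The comma rule is deliberately crude. It only shows that requiring some context (case, position, neighbouring punctuation) is enough to stop a surname or a Spanish preposition from being read as a state.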

Someone still has to type in the text. So they might as well just read it out loud. I don't see the point.

Other systems can be put in place, that generate the video and the text.

The main takeaway is the realism of the news anchor through their implementation of the speech and lip movement synchronization - and how it may be a stepping stone towards a world where news delivery can be automated.


>Reminder that one chief editor of the People's Daily committed suicide days ago, following several similar suicides.

>A stressful job even to read propaganda after all I guess.

They even created a thing called "People's Search Engine" though it flopped.


Personally I could care less to listen to someone read the news. I can read much much faster. Talking heads seems so 20th century.

Sounds like an espeak voice synthesizer.

Throwing the term AI around is like the word Cyber was in the 1990s.

It would be more realistic if his/its jaw would move.

The rise of the machines as described by Thomas Rid.

Sounds well behind the current best text-to-speech.

How is this AI?

The image and voice synthesis is powered by deep learning. The news anchor itself doesn't possess any intelligence.

i wonder what would happen if such an AI newscaster interviewed the president

Puppet anchors

That's actually pretty cool. Reminds me of those youtube videos full of automatically generated videos with TTS. This would be the next step :)

Also, anime girls.


Not at all challenging, considering the average CCTV (what a fitting name for a chinese tv station) anchor's range of emotional expression is limited to a knob.

Might as well just produce a deepfake anchor with Xi's ugly ass face instantly traumatizing millions of Chinese children, who knows how many falun gong infants were sacrificed to save the poor sick Chinese elite (they actually eat the fucking placenta for supposedly viagra effect jesus fucking christ im outta here never EVER gonna catch me going to China)


If you keep posting unsubstantive flamewar-style comments we'll ban the account. Please just post something thoughtful and informative that we can learn from instead.

https://news.ycombinator.com/newsguidelines.html


Please try to not be so polemic and disagreeable in your posts, and instead bring a spirit of respect here. Chinese people are human beings.

> Please try to not be so polemic and disagreeable in your posts, and instead bring a spirit of respect here.

Pointing out the crimes of the PRC is not disrespectful.

Except, perhaps, to the PRC.

> Chinese people are human beings.

Nobody's saying they're not.


> Except, perhaps, to the PRC.

or people who have been unwittingly duped/brainwashed into buying that "China will overtake USA anytime now" propaganda.

it's not limited to race, I've definitely seen some white American dude shilling for China...smh


> it's not limited to race

Very true.

https://www.bbc.com/news/uk-england-27307476

> The Royal College of Midwives (RCM) said there was not enough evidence for the organisation to "either support or not support" placentophagy as there had not been enough research on the health benefits.

On the one hand I didn't want to post this because it kinda grossed me out and might ruin someone's day or something -- on the other hand, I "totally could see it being a thing in Asia", just because of the other powdered animal stuff; but wasn't aware of it being a thing, period, and I guess that would describe many people. And some of the sentences in that article are really quite something, oh my.


I literally just threw up after finishing my avocado sandwich.

should add NSFL to that link.

im done with HN for the rest of the day.


I'm not understanding the relevance here, could you explain?

I already did? OP said placenta eating is a thing in China, and

> I "totally could see it being a thing in Asia", just because of the other powdered animal stuff; but wasn't aware of it being a thing [everywhere], period, and I guess that would describe many people

so I shared my, uhh, information, instead of just letting that stereotype kinda linger.


I didn't see OP's original rant, it's been flagged. I'm assuming there was something in there about how awful Chinese people are because placenta?

As far as it being a thing.. I'm pretty sure historically it's a really big thing. Protein is expensive for subsistence farmers. Too bad OP lost their lunch over it


> I'm assuming there was something in there about how awful Chinese people are because placenta?

Exactly that.


PavlovsCat wrote a very generalizing statement about what "Asians" eat and then tried to smear it on me. It was a thinly veiled racism but then again people from the UK don't seem to have a good grip on what is considered racist in North America. What he wrote would fall under sneaky bigotry. Not full blown racism but I'm used to seeing dogwhistling on reddit.

I only wrote that the corrupt Chinese elite exploit the people including taking prisoners organs, prostitution, all sorts of human rights abuse.

but somehow it's been derived to that removed comment which implied that I didn't view Mainland China unfavorably, please don't try to conflate it with sinophobia, I haven't made that case at all, it's just appalling behavior from the Communist party of China that pisses me off because it pains me to see what my Chinese friends go through.


> PavlovsCat wrote a very generalizing statement about what "Asians" eat

I said I considered it possible because of the powdered animals stuff [0]. That's not even a statement about "what Asians eat", much less a generalizing one.

[0] https://www.forbes.com/sites/jamesconca/2014/08/08/extinctio...

> It was a thinly veiled racism

I said I "totally could see it being a thing in Asia". Call that racism if you must, but there is nothing "veiled" here.

> I only wrote that the corrupt Chinese elite exploit the people including taking prisoners organs, prostitution, all sorts of human rights abuse

Yeah, and you put it like this

> the poor sick Chinese elite (they actually eat the fucking placenta for supposedly viagra effect jesus fucking christ im outta here never EVER gonna catch me going to China)

(anyone can turn on "showdead" and still see the comment, you know that, right?)

Sure, technically I guess "they" refers to the Chinese elite. Insofar I accept your correction. But still, way, I actually looked something up real quick that even you, who made the claim, apparently knew nothing about... and as I said, I only "shared what I found" because it refutes any stereotypes about only Asian or Chinese people doing that, heh.

Anyways, back to a vegan Alicia Silverstone eating her placenta; I find that by itself infinitely more fascinating than any of whatever this is.


Like I said, I didn't see your original comment. All I'll say is that I encourage you to listen to people who've lived in China about Chinese problems rather than just overlaying an American worldview.

Edit: oh nevermind, I can see the comment now. Shame on you.


Maybe you're so bought into an "us vs them" mentality that you equated fair-mindedness with "shilling".

I like the idea, this is very economically sane since it will allow the news company to deliver news all day without errors from human anchors. Although people would still have to write the script, I know the state will always put their best people to write 24/7.

Do you prefer watching an error-free (in that it doesn’t stumble) news segment delivered by a computer-visualized robot or an imperfect one by a human?

I mean personally, News Bloopers are the best kind of Bloopers. We laugh at our own faults. It's more human this way.

And it will have no issues delivering "fake news" :) /sarcasm


