
The Forces Behind Toutiao, China’s Content King - anuh
http://blog.ycombinator.com/the-hidden-forces-behind-toutiao-chinas-content-king/
======
nobahashi
Hi Anu,

1.) Does the machine learning algorithm behind toutiao abide by the censorship
effort from Beijing? For example, are any of the contents about the
disappearance of the wife of the nobel prize winner Liu Xiaobao censored?

2.) What happens when the Chinese government demands board seats, much as it
did with Tencent and Alibaba? Or is toutiao already working closely with the
Chinese government?

3.) Do you know whether or not toutiao is allowing this technology to be
outsourced to other dictatorship countries that would allow their leaders to
effectively control the contents people read every day?

4.) Are you aware that toutiao is spreading its influence into East Asia, and
possibly India, and is it possible that it would spread Chinese government's
message through its apps?

~~~
whooshee
Nobel Peace Prize is more political and ideological than any other Nobel
Prizes. I don't consider it to be a 100% valid one. Not like academic ones
where people send proofs. And that guy is named Liu XiaoBo not XiaoBao. His
wife is fine, and there are no reasons to do anything about his family.

Toutiao is just another company run for profit. Profit is its focus. I
don't see it becoming popular outside China, even in East Asia.

You may worry more about your own right-wing Indian government.

~~~
new299
"wife is fine"

Hu told the BBC that three years of house arrest had thrown Liu into deep
depression and a health professional had been prescribing antidepressant
medication for her. [1]

[1]: [http://www.bbc.com/news/world-asia-china-25206137](http://www.bbc.com/news/world-asia-china-25206137)

------
flyingaragogue
First of all, I wouldn't consider Toutiao content to be "quality content", as
the article says. It's still pretty much all "soft news". Secondly, this 100%
engagement-driven content recommendation algorithm doesn't serve to inform
readers in any meaningful way, as a proper news app should. Rather, it
just keeps people clicking and wasting their time.

~~~
sanxiyn
There was a whole chain of separate departments dealing with proletarian
literature, music, drama, and entertainment generally. Here were produced
rubbishy newspapers containing almost nothing except sport, crime and
astrology, sensational five-cent novelettes, films oozing with sex, and
sentimental songs which were composed entirely by mechanical means on a
special kind of kaleidoscope known as a versificator.

-- 1984, George Orwell

------
com2kid
This is what stuck out most for me:

> The engine learns quickly – for most users, it takes less than one day to
> successfully learn their interests (indicated by 80% read rates).

That is insane. I don't know of any other platform (except for maybe
image-heavy subreddits?) that has that rate of content consumption. Facebook
obviously tries, but they are so focused on their limited content sources that
they frequently don't have the content people want. Twitter has similar
issues: if there aren't 50 tweets you want to read in a row, then as you keep
scrolling Twitter has to drop the quality of content to make sure the
feed shows something.

But by being willing to surface any type of content (give users what they
want), it sounds like Toutiao has been able to fully leverage AI. Removing
constraints on what content exists naturally leads to better results.

Also, it is a miracle if they truly solved ad targeting. In comparison, I am
still getting ads for a belt I bought 3 weeks ago.

~~~
iflp
> That is insane.

No, it's not. It's just easier to learn to entertain average people than
to entertain a smaller group of more tasteful people, or to learn to sell
things.

I was also quite impressed when I learnt the figures, and am still impressed
at the fact that it managed to capture the mode of people's interest, but my
every attempt to use Toutiao has ended in pain. I certainly spent longer than
one day.

------
nullnilvoid
As an avid user of Toutiao, I am super impressed by Toutiao. It is super easy
to use and shows the content I am mostly interested in. I use it daily to get
information. I am a little bit worried that I would not get broad views if
Toutiao only shows articles I am interested in.

~~~
chenster
Flipboard's recommendation engine is really good; I would say it has already
solved the content discovery and recommendation piece of the puzzle. The key
is to engage users early on and keep them using the product.

The main differentiator is that Toutiao is able to leverage AI to write
grammatically sound articles ... in Chinese when there isn't enough content
already. China has SO MUCH of the data required for deep learning due to its
population and the omnipresence of mobile devices. Other countries just
couldn't compete.

------
joe_the_user
TL;DR; Does engagement equal success?

I have an account on Youtube that I used for only a specific interest.

Youtube's suggestions are pretty bad. And they've declined in quality. And
Youtube makes a strong effort to sell me mainstream content that I
uniformly reject. And I can find better recommendations among suggested videos
fitting my interest than I find on the Youtube landing page for my account.
And I'd be all for Youtube keeping a list of search terms and such for further
viewing.

My deduction from this is that Youtube is more interested in pushing content
of other people's choice than giving me what I want. And that's pretty
standard for how all media has worked since the early days of radio.

So it seems like it would be easy for a content company to get engagement by
giving people what they want to view. The challenge is that what
people want to view often doesn't earn much per view. So having a
bunch of engagement by itself doesn't give you success; rather, what makes
money is a careful balance of desired content and pushed content (i.e., crap).

~~~
slackoverflower
Very much disagree. Youtube's suggestion algorithms have improved incredibly
over the years. They are increasingly surfacing content that I find interesting.
I spend much more time on Youtube nowadays, easily more time than on Facebook. I
regularly just go to the Youtube.com landing page to see what it wants to show
me, and it always shows great quality content. Youtube has done an amazing job
with their algorithms, and it's showing.

Check out this article: [https://www.theverge.com/2017/8/30/16222850/youtube-
google-b...](https://www.theverge.com/2017/8/30/16222850/youtube-google-brain-
algorithm-video-recommendation-personalized-feed)

------
l5870uoo9y
How many "social features" does Toutiao have in comparison to Facebook or
Twitter? I thought that was the essential feature to get user engagement, but
Toutiao blows Facebook, Twitter, Instagram and Snapchat out of the water.

Toutiao stays true to Asian usability dogmas and opens all links in new tabs
(target="_blank")[1].

[1]: [http://www.toutiao.com/](http://www.toutiao.com/)

~~~
anuh
Good question. They did not have many social features at the start -
especially in year 1 (no followers, no friends, no chat). Over time they added
like, dislike, retweet, and comment functionality. What made them really stand
out in the first year was the catchy title, delicate product design, and being
pretty much the first personalized news aggregator with recommendations in
2012. They also iterated heavily to remain one of the top apps in the app store
during the first year. In the first month, users gravitated towards the app as
it became the destination to read all the top trending news/stories that were
going viral that day (it filled a gap in the market; no one else was doing it).
Over time they recommended stories to users and tracked data to ensure only
relevant stories were being pushed, which helped increase engagement.

------
alangou
Toutiao is solving the Internet's current most pressing consumer-facing
problem - the discovery of relevant content.

It's a problem orthogonal to what Google solved earlier in the millennium,
which is roughly "How do I find an answer on the Internet to a question I
already have in mind?"

As the Internet has developed, the cost of content creation and data have
decreased dramatically. And so the sheer quantity of content has exploded.
Every consumer tech company has partially recognized this problem - Facebook,
Twitter, Quora, Instagram, Snapchat, and Medium all want to recommend you
relevant content, all have adopted feeds of articles and posts and pictures,
and all desperately want your continued attention.

At this point, there's likely good content on the Internet for every single
person's taste (like Anu said, there's a very long tail of content creators).
The service that provides the most relevant selection of it will win consumer
interest and usage at the largest scale.

And it certainly seems that Toutiao has leapfrogged everyone else in this
regard. I'm very excited (but also a little wary) to see what they come up
with in the future.

~~~
flachsechs
google absolutely sucks at it, probably because they don't see any value in it
given their user base (everyone) and lack of competition.

i.e. my youtube front page just shows the same 100 video links over and over
and over.

there isn't even a really good way to discover relevant content _in the
channels i've already subscribed to_. it's pretty lame.

~~~
samstave
YouTube makes me want to punch my computer.

The related videos are never relevant, and letting a child meander through
YouTube invariably winds up showing my kids super inappropriate content. The
YouTube mobile app (iOS) sucks horribly. I literally think that YouTube is one
of the worst ubiquitous services there is.

~~~
moron4hire
This isn't a great solution, because kids are pretty good at getting out and
into whatever app they want, but there is a YouTube Kids app. It's more like
streaming TV than YouTube.

------
indescions_2017
Toutiao is one of the names often bandied about as a potential candidate for
a US-based IPO, alongside Best, Roku, CarGurus, and some 100+ other Chinese
consumer internet unicorns ;)

Your insightful analysis of Toutiao's AI depth is enough to tempt one to dip a
toe should shares trade publicly. But the revenue growth (contextual ad based)
is phenomenal as well. Any clues as to how they might seek to expand into paid
content or media? Or otherwise grow beyond Chinese language centric markets?

Chinese startup Toutiao raising funds at over $20 billion valuation

[https://www.reuters.com/article/us-china-toutiao-
fundraising...](https://www.reuters.com/article/us-china-toutiao-
fundraising/chinese-startup-toutiao-raising-funds-at-over-20-billion-
valuation-sources-idUSKBN1AR0DE)

~~~
anuh
Toutiao is now focused on video as it is increasingly becoming more important
(more supply of video content and more consumption as well). Outside of
Chinese language centric markets they have an investment in India (which they
have announced publicly). India is definitely a market that could potentially
benefit from algorithm based content discovery (several different
states/languages and a long tail of content that is not easily available to
users).

------
whooshee
Huh, more time-killing than social media. Feeding people content that they
like, putting them in bubbles, and reaping the profits. Much of the Internet
has been on this track for years. Last year's election showed how
much this kind of media can do.

~~~
slackingoff2017
Reminds me of this [https://www.theawl.com/2015/06/a-complete-taxonomy-of-
intern...](https://www.theawl.com/2015/06/a-complete-taxonomy-of-internet-
chum/)

------
anuh
Hey! Anu here, a Partner at YC's Continuity Fund. Happy to chat about anything
discussed in this post and would love any feedback.

~~~
zmitri
I'm a little disappointed in this post given China's penchant for controlling
news/media/content.

I see how this could easily be manipulated into an extremely powerful
propaganda tool - especially hiding behind the guise that "algorithms" are
doing all the work - and yet it is not mentioned at all in the post.

We have already seen how content systems like Facebook, Google, and Twitter
can be manipulated via botnets, but when a central power manages this it can
obviously get significantly worse.

Does this concern you as an investor who lives in America?

~~~
anuh
I am afraid I cannot do justice to this question by answering it in a few
sentences in this forum. Toutiao is one of the first apps in China to enable
creators to write content and find audience without having to go through other
media outlets. So in a way they are transforming the way information is made
available to people. If you would like to discuss more email me directly at
anu@ycombinator.com

~~~
zmitri
I agree it's tough to do justice in a few sentences in a forum, which is why I
assumed it would be done in the very in depth and otherwise interesting
article you wrote :)

In my opinion it's the absence of this topic from the article that makes YC
look tone deaf once again by glossing over a serious repercussion that
technology can have on society for the sake of self-promotion -- especially
given current events regarding the manipulation of social networks by foreign
entities and a very serious history of censorship and media control.

------
NumberCruncher
>> The average user spends more than 74 minutes each day in Toutiao

so many wasted lives

~~~
paulsutter
I’d guess many of us spend 74 minutes a day on HN

------
bllguo
I'm sure there's research on this already, but just a thought - it seems to me
like it would be easier to do NLP on Chinese, vs English. (I speak both). Much
easier to recognize the author's meaning in Chinese characters, and more often
than not people use the same words to describe something.. vs. the frankly
disgusting mess that is English. Misspellings abound, the same words can mean
all kinds of things, grepping for strings is hard because sequences of letters
aren't unique, etc.

~~~
yorwba
Do you speak Chinese? NLP for Chinese comes with its own set of challenges.

Because word boundaries aren't marked, you need word segmentation, which
requires understanding in lots of cases. (Can't tell where a word ends and the
next one begins if you don't know what they mean.)
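
To make the segmentation point concrete, here is a minimal sketch of greedy forward maximum matching, a simple dictionary-based technique. The tiny vocabulary is a hand-picked assumption for illustration, not a real lexicon, and nothing here is claimed to be what Toutiao uses. The sample phrase 研究生命起源 means "study the origin of life", but the greedy matcher grabs 研究生 ("graduate student") first and gets it wrong:

```python
from typing import List, Set

# Toy vocabulary, assumed purely for illustration.
VOCAB: Set[str] = {"研究", "研究生", "生命", "起源"}


def segment(text: str, vocab: Set[str]) -> List[str]:
    """Greedy forward maximum matching: take the longest known word at each position."""
    max_len = max(len(word) for word in vocab)
    words = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, falling back to a single character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in vocab:
                words.append(candidate)
                i += size
                break
    return words


print(segment("研究生命起源", VOCAB))
# The intended reading is 研究 / 生命 / 起源; the greedy result differs.
```

Picking the intended split requires knowing which reading makes sense in context, which is why segmentation cannot be purely mechanical.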

Many words have a literal meaning and a metaphorical one (e.g. 纠结 can mean
both "tangled" and "confused").

Different synonyms being used for the same concept are common as well,
especially when you contrast formal vs. informal writing.

Misspellings can happen too, where a character is substituted with a similar
character that has the same pronunciation. Sometimes completely different
characters are used as a kind of pun.

Grepping is less likely to yield false positives (unless you're looking for a
single-character word that can appear in compounds), but there is no easy way
to do fuzzy matching to account for misspelled words.
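
As a rough illustration of why homophone misspellings defeat plain substring search, the sketch below compares text by pronunciation instead of by character. The `PINYIN` table is a tiny hand-written assumption (tones ignored); a real system would need a full pronunciation dictionary, which is exactly the extra resource that makes this harder than grepping:

```python
# Hypothetical, hand-written pronunciation table (tones ignored).
PINYIN = {"再": "zai", "在": "zai", "见": "jian", "现": "xian"}


def sound_key(text):
    """Map each character to its pronunciation, or to itself if unknown."""
    return tuple(PINYIN.get(ch, ch) for ch in text)


def contains_sound_alike(haystack, needle):
    """True if some substring of haystack is pronounced like needle."""
    h, n = sound_key(haystack), sound_key(needle)
    return any(h[i:i + len(n)] == n for i in range(len(h) - len(n) + 1))


typo = "明天在见"  # 在 typed in place of the homophone 再
print("再见" in typo)                      # exact substring search
print(contains_sound_alike(typo, "再见"))  # pronunciation-based search
```

This only covers same-sound substitutions; the pun-style substitutions mentioned above would still slip through.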

------
thisisit
Hi Anu,

Most probably this might not get a reply but I still wanted to know about this
statement - "During the 2016 Olympics, a Toutiao bot wrote original news
coverage, publishing stories on major events more quickly than traditional
media outlets. The bot-written articles enjoyed read rates (# of reads and #
of impressions) in line with those produced at a slower speed and higher cost
by human writers on average."

1. What does writing original news coverage entail? Did the bot put together
a couple of news sources to create an article, or do you mean it transcribed
the news from the live video feed as it happened?

2. How much of the bot feed works due to high declension? [1] Because
normally bot output tends to make less sense, as the sentence
structure comes out all wrong.

[1]
[https://en.wikipedia.org/wiki/Declension](https://en.wikipedia.org/wiki/Declension)

Additionally, there is quite a lot of focus on getting users to stay on the
platform. Not to sound alarmist here, but doesn't that mean you are, most of
the time, helping people reinforce their world views? News should be reported
as is, but if user engagement is so important, news will be written with a
right- or left-wing slant just to keep the user on the platform.

~~~
anuh
First, writing stories on Olympic game results required data, and Toutiao
pulled it from three sources: [a] real time score updates from the Olympics
organization, [b] images from an image-gathering-company it had recently
acquired to find relevant visual media, and [c] monitoring live text
commentary about the game. Second, Toutiao had to figure out how to combine
data from these three sources to ensure an internally consistent and relevant
story. As a result of these efforts they were able to publish a news story about
a game approximately 2 seconds after the event ended, because they collected and
processed the information as the game progressed. In Toutiao's case news is
what they launched with, but today there are many more reasons why
users use the app (well beyond news). Local news, weather, information on
agriculture, etc. are surfaced regularly which is why the app is quite popular
nationally in China and not just in Beijing and Shanghai.
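
For readers wondering what "combining three sources" might look like in code, here is a heavily simplified sketch. Everything in it is an assumption made for illustration: the `ScoreUpdate` type, the `write_story` function, the template sentence, and the placeholder data are all hypothetical, and Toutiao's actual pipeline is not described in enough detail to reproduce.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ScoreUpdate:
    """One final result from a (hypothetical) official score feed."""
    event: str
    winner: str
    loser: str
    score: str


def write_story(result: ScoreUpdate, commentary: List[str], image_url: Optional[str]) -> str:
    # Keep only commentary lines that name a participant, so the text
    # cannot contradict the score feed by describing a different match.
    relevant = [c for c in commentary if result.winner in c or result.loser in c]
    lines = [f"{result.winner} beat {result.loser} {result.score} in the {result.event}."]
    lines.extend(relevant[-2:])  # the most recent remarks, closest to the finish
    if image_url:
        lines.append(f"[image: {image_url}]")
    return "\n".join(lines)


# Placeholder data, not a real result.
result = ScoreUpdate("table tennis final", "Player A", "Player B", "4-3")
story = write_story(
    result,
    ["Player A takes the deciding game.", "Crowd noise rises in the other hall."],
    "https://example.com/final.jpg",
)
print(story)
```

Because the commentary and score are already in memory when the match ends, the only work left at the final whistle is filling in the template, which is consistent with publishing within seconds.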

------
ww520
AI has seen a resurgence in the last couple of years, and it's interesting to
see how quickly they have adopted the technology to build a company. Simple AI
techniques applied in key areas can generate huge results.

------
ck425
"While timing is everything for a startup, it takes deliberate effort to build
an addictive app"

Anyone else deeply uncomfortable with this? Not a useful app, an addictive
app. :/

------
justaschmoe
Off topic:

[http://www.toutiao.com/a6474835351272686094/#p=7](http://www.toutiao.com/a6474835351272686094/#p=7)

The (Beijing?) smog pictured here is somehow more poignant since the article
has 0 to do with pollution... just people living their lives. Scary.

~~~
bgee
That is Laojun Mountain, a 7000+ foot peak and a national AAAAA-rated tourist
attraction (and not in Beijing). So no, I don't think that's smog, probably
just fog.

~~~
bgee
Peak's Wikipedia page:
[https://zh.wikipedia.org/wiki/%E8%80%81%E5%90%9B%E5%B1%B1_(%...](https://zh.wikipedia.org/wiki/%E8%80%81%E5%90%9B%E5%B1%B1_\(%E6%A0%BE%E5%B7%9D\))

------
eddieplan9
Not to discount their achievements, but it's important to note that as the
content creator (instead of an aggregator or a content platform), it's much
easier to have control over the quality of metadata, which is immensely useful
in recommendation engines.

------
komali2
Checked out Toutiao's website, I think registering if you don't speak Chinese
is nearly impossible - google translate doesn't seem to work on the login
page.

You can copy/paste each string into google translate individually, if you're
patient enough ;)

~~~
anuh
It is a mobile first app :) I don’t think they want anyone going to their
website!

------
huac
> read rates (# of reads divided by # of impressions)

Why does Toutiao use this metric to measure quality of an article?
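
For reference, the quoted definition is just a ratio. A minimal sketch, with made-up article names and counts used purely as placeholders:

```python
def read_rate(reads: int, impressions: int) -> float:
    """Fraction of impressions (times an article was shown) that became reads."""
    if impressions <= 0:
        return 0.0  # an article that was never shown has no meaningful rate
    return reads / impressions


# Hypothetical per-article counts: (reads, impressions).
articles = {
    "olympics-recap": (800, 1000),
    "weather-update": (150, 600),
}

rates = {name: read_rate(r, i) for name, (r, i) in articles.items()}
for name, rate in rates.items():
    print(f"{name}: {rate:.0%}")
```

Note that the ratio says how often a shown headline gets opened, not whether the article was any good once read.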

~~~
netheril96
Not a measure of quality, but a measure of potential profit.

------
huangc10
How can we trust any of the analytics coming out of mainland China? Can you be
100% positive that Toutiao is not controlled by the Chinese government? They
can easily funnel all internet traffic to Toutiao and control it, and use it
to their advantage. Curate the news to whatever way they want.

Toutiao is a privately owned company held in Beijing. Am I being paranoid?
They can claim machine learning (and they probably do). However, until we can
be 100% certain the Chinese government is not involved, I wouldn't personally
trust any of the stats.

~~~
anuh
I can’t convince you to be a believer :) That is totally up to you!

