Media's Obsession with Trump Not Supported by Data (parsely.com)
32 points by sak84 on May 26, 2016 | 27 comments

The charts under the header "Trump Does Not Drive Revenue" seem to indicate 2.5x the number of articles can be written about him (vs Clinton) without dropping the number of page views significantly below other candidates.

So if the value of a page view is worth about the same for each candidate, Trump is driving more revenue, right?

I'm one of the authors of the article. Quick answer: The data shows that if you're a journalist about to write your next article, you're likely to get more views if you write it about Clinton rather than Trump. It's true that Trump got more pageviews overall, but that seems to be mostly because way more articles were written about him in the first place.

Long Answer: In the article we suggest that if publishers would have written more articles about Clinton, they would have received more page views, because in the data we observe posts on Clinton receive more page views on average. Similarly, we suggest writing more articles on Bernie Sanders would have caused an increase in referrals from social and search. As with all non-experimental approaches to causal inference, valid conclusions require strong assumptions. In the case of this analysis, we assume that the average number of page views that articles on a candidate receive is independent of the number of articles written on that candidate. If it were the case that writing more articles on Clinton, and fewer on Trump, would have caused Clinton articles to receive fewer views, and Trump articles to receive more, then our conclusions might be wrong.

I feel like you're working from a broken model.

Trump is the head, and the other candidates are the long tail.

You said: "It's true that Trump got more pageviews overall, but that seems to be mostly because way more articles were written about him in the first place."

And: "we suggest that if publishers would have written more articles about Clinton, they would have received more page views, because in the data we observe posts on Clinton receive more page views on average"

This seems to be the wrong conclusion because of diminishing returns. Writing more articles about Clinton should still push down the average page views. There is only so much interest, and only so much new to write about every day. None of the other candidates can create fresh new controversies to feed the media the way Trump can. The question is how much would that push it down? Given that these sites exist in a market, I don't think it would be inaccurate to suggest that it would push the average down significantly below Trump's.

I don't believe a base that strong exists per article, where any article is guaranteed to get some absolute number of page views. If diminishing returns aren't present or are extremely weak, then I'm wrong.

If anything, the data doesn't rule out that sites/reporters are correctly maximizing Trump coverage. Or they may not be maximizing enough since absolute demand is so high and Trump generates so much fresh content. If you can write about one easy topic, and maintain an average that high with only a small decrease in the average, you are doing more with less.

In the comment above, I try to clearly lay out that my conclusion rests on the assumption that for a given candidate, avg views per article and number of articles are independent.

I agree that if you reject this assumption, and instead assume that there are 'diminishing returns', then the conclusion I arrived at could be wrong.
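To make the disagreement concrete, here is a toy sketch of the two assumptions side by side. It is not the article's model: the function names, the 10,000-view starting point, and the geometric decay factor are all hypothetical, picked only to show how the two assumptions diverge as the article count grows.

```python
# Toy comparison of the two assumptions discussed in this thread.
# All numbers are hypothetical and chosen only for illustration.

def total_views_independent(n_articles, avg_views):
    """Independence assumption: every article earns the same average."""
    return n_articles * avg_views


def total_views_diminishing(n_articles, first_article_views, decay):
    """Diminishing returns: each extra article earns `decay` times
    the views of the previous one (0 < decay < 1)."""
    return sum(first_article_views * decay ** k for k in range(n_articles))


for n in (10, 20, 40):
    flat = total_views_independent(n, 10_000)
    dim = total_views_diminishing(n, 10_000, 0.95)
    # Columns: article count, total under independence,
    # total under diminishing returns, average under diminishing returns.
    print(n, flat, round(dim), round(dim / n))
```

Under independence, doubling the article count doubles total views. Under the decay model the total still grows, but the per-article average falls with every extra article. Which regime the real data is in depends on how strong the decay is, and the thread does not settle that.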

There probably is some kind of diminishing return effect, but we don't know how strong it is. It could be weak compared to the effect that 'readers will consume whatever journalists write'. It's pretty interesting that the last four leading candidates (Trump, Cruz, Clinton, and Sanders) all had roughly comparable numbers for pageviews per post. That's evidence that readers just pretty much read whatever is published (with the exception of Kasich, a long shot).

It's also true that if you're a journalist right now, faced with the current distribution of articles, you're likely to get more page views by writing your next article on Clinton. This claim doesn't rest on any strong assumptions. That could change if many more articles on Clinton are written, but it's true for now.

If you look at the data in the dashboard, it's also interesting to see that Bernie Sanders gets way more social and search referrals compared to Clinton and Trump.

I think this assumption is dangerous to begin with. It should be proven with data that there are no diminishing returns, and that would be a powerful finding worthy of attention.

And saying that readers will consume whatever journalists write helps power the narrative that the media fueled Trump's campaign. They could write about a different candidate and get slightly improved pageviews, but they're choosing to flood with Trump articles. Your data would only conclude that pageviews aren't driving it.

I think expanding this to "readers will consume whatever journalists write" is a different argument and you would need to establish your "experiment" with a different methodology than the approach used here. The causation seems to be "reporters write news, it exists to be consumed on a site" therefore "readers read it" and that feels like it's missing something to me.

Also, it could be interesting they have comparable numbers per post, but it also backs up the idea that articles exist in response to the demand-supply feedback loop. If sites respond to pageviews, then candidates with lower average pageviews will simply not get as much media attention.

You bring up a good point. Journalists should explore to what extent there are diminishing returns to writing articles on other candidates. Statistics tells us that the most efficient way of exploring this hypothesis is with a multi-armed bandit algorithm. But before I go into that, I think it makes sense to break this problem down into two questions:

(1) Given equally interesting ideas for articles to write on each candidate, which candidate should a journalist write on?

(2) How much investment is required to write an interesting article on each candidate? It might require less work to write something interesting on Trump than on Clinton.

The right way of answering question (1) is with a dynamic multi-armed bandit algorithm. Such an algorithm dynamically explores the problem of diminishing returns. At this point, given the data we have, such an algorithm would suggest you should write on Clinton the vast majority of the time if you're interested in page views per article, and would suggest you write about Sanders if you're interested in bringing in external referrals from Facebook and Google. If journalists followed the advice of such an algorithm and wrote so many articles on Clinton that readers started to lose interest, then the algorithm would begin to suggest you write on someone else. If there's enough interest in this article, I might write a follow-up where I fit a model that tells journalists what topic to write on, given that it's just as easy to write an article on each topic. I could update this model every once in a while to make sure it detects those diminishing returns in time.
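For readers who haven't seen a bandit before, here is a minimal sketch of the idea. It uses a discounted epsilon-greedy rule, which is just one simple way to handle rewards that drift over time and not necessarily the algorithm the author has in mind. The class name, the `simulate` helper, the base view counts, and the reader-fatigue formula are all assumptions made up for this example.

```python
import random


class DiscountedEpsilonGreedy:
    """Epsilon-greedy bandit whose per-arm estimate favors recent rewards."""

    def __init__(self, arms, epsilon=0.1, discount=0.9, seed=0):
        self.arms = list(arms)
        self.epsilon = epsilon      # probability of exploring a random arm
        self.discount = discount    # weight applied to an arm's older rewards
        self.rng = random.Random(seed)
        self.weighted_sum = {a: 0.0 for a in self.arms}
        self.weight = {a: 0.0 for a in self.arms}

    def estimate(self, arm):
        # Arms never tried get an infinite estimate so they are tried first.
        w = self.weight[arm]
        return self.weighted_sum[arm] / w if w > 0 else float("inf")

    def choose(self):
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.arms)
        return max(self.arms, key=self.estimate)

    def update(self, arm, views):
        # Shrink the arm's history, then add the newest observation.
        self.weighted_sum[arm] = self.discount * self.weighted_sum[arm] + views
        self.weight[arm] = self.discount * self.weight[arm] + 1.0


def simulate(rounds):
    """Hypothetical environment: views per article fall as more articles
    are written on the same candidate (made-up fatigue formula)."""
    base_views = {"Clinton": 10_200, "Trump": 9_900, "Sanders": 9_000}
    written = {name: 0 for name in base_views}
    bandit = DiscountedEpsilonGreedy(base_views)
    for _ in range(rounds):
        arm = bandit.choose()
        views = base_views[arm] / (1 + 0.05 * written[arm])
        bandit.update(arm, views)
        written[arm] += 1
    return written


print(simulate(200))
```

Because the estimate leans on recent articles, a candidate whose views start to sag loses the top spot and the recommendation moves on. That is the "detect diminishing returns in time" behavior described above, in miniature.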

Question (2) is more difficult to answer and requires more domain knowledge. I would say it is possible at any moment to write hundreds of interesting articles on each candidate---the real question is how much work it takes. As I mention in the blog post, I am convinced that journalists find it easier to write interesting articles about Trump. So in some sense it's rational for them to do so: the 'return on investment' is higher because it's so cheap to churn out another article on Trump's latest soundbite. However, one could also argue that -- in the name of increased page views, or in the name of a functioning democracy -- they should make the extra effort to write an interesting article on the other candidates.

That seems to be what the data is saying. I think the article is trying to paint it differently and they don't actually understand the data.

Exactly. They ignored the law of diminishing returns.


From the article:

The fact is that in the midst of today’s 24-7 news cycle, most journalists can devote only a small amount of time to their next article, and so they often find themselves choosing topics that are convenient to write about. Imagine you’re a journalist in front of a blank screen, thinking about your next story, and faced with intense pressure to pump out content. There may be no clear breaking news on Clinton, Sanders, Cruz, or Kasich — so writing about these candidates may require you to conduct research or reach out to voters.


I really think this is what it comes down to; this and the "access" issue, where reporters are scared of losing direct access to spokespeople and candidates who provide easy and ready-made stories and quotes. Without that, reporters are forced to, you know, report, which they apparently don't have time for anymore.

Trump gets more page views per article than anyone but Clinton. From the article's graphs it looks like Trump generated more page views than all other candidates combined. Trump's behavior writes news articles by itself. If it takes 50% of the reporting effort to get the same number of clicks then I think the media's obsession is supported.

You're absolutely right and the Parsely article completely ignored this fact. There is probably some truth to 'the public reads whatever it is the media decides to write about', but they are completely ignoring the law of diminishing returns. If someone artificially limited the number of articles about Trump so that it did not exceed the number of articles about Clinton, all the people who want to read about Trump would flood those articles. This would also deter journalists from writing all the low quality articles. The large quantity of low quality articles is also pulling down the average, hiding the fact that the good articles are getting many more page views. These are the articles that make other journalists want a piece of that pie.
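The mechanism described here (many weak articles hiding a few strong ones in the average) is easy to show with made-up numbers. Nothing below comes from the Parse.ly data. The 3 strong and 17 weak articles and their view counts are hypothetical, and whether the real distribution looks like this is not established in the thread.

```python
from statistics import mean, median

# Hypothetical distribution: a few strong articles, many weak ones.
strong = [40_000] * 3   # well-reported pieces
weak = [4_000] * 17     # quick, low-effort pieces
all_articles = strong + weak

overall_mean = mean(all_articles)
overall_median = median(all_articles)

print("mean views per article:  ", overall_mean)
print("median views per article:", overall_median)
print("mean of strong articles: ", mean(strong))
```

With a skew like this, the per-article mean says little about how the best articles perform. Comparing candidates on the median or on top-decile articles as well as the mean would show whether this effect is actually present.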

That's not true. If you're looking at the interactive chart, you need to check the box that says "Show page views per article". Otherwise you're just seeing raw pageviews, and Trump has more of those because there were so many more articles written on him.

I think that's what @nimblegorilla was suggesting, and I think it actually fits with what the article was saying - the ROI on writing an article on Trump is much better than for other candidates because it's easy to write yet another article, and still get loads of pageviews for it.

In addition, Trump has significantly pushed up the total number of pageviews going to election cycle articles, even if the articles specifically about him are not as popular as the articles about Sanders, Clinton, Cruz, etc.

I'm not sure what you are disputing. The page views per article graph shows Trump within 10% of Clinton on a per article basis.

This article only continues to prove how little research is done before articles can be published.

The media is reporting on itself and its 'effects' without even taking the time to look at the data that is available through sites like Parse.ly and Google Analytics to see if their theories hold any water.

I was thinking the same thing. The sheer irony of this article proves what little thought goes into "news" these days.

When I look at the coverage, it seems like the news outlets are obsessed because Trump doesn't fit their world view. He caters for a demographic that those outlets have left behind. The page views aren't overly high because what he says isn't as outrageous as they think.

He's a self interested huckster.

The media has paid plenty of attention to him in the past.

> He's a self interested huckster.

You are describing candidates for office in general. Sanders may be an outlier, but the only one.

If the media loves writing articles about Trump, they could at least put a modicum of thought into topics. There are so many easy topics.

Far from Trump taking over the GOP, the GOP has taken over Trump and now literally dictates his policy.

Bankers contributed money and bought Trump's support for financial deregulation.

Trump's energy and coal policies are literally being written by GOP governors. The GOP now coordinates with Trump's speechwriter to determine what is said.

Starting a trade war with China would have a direct negative impact on real estate prices in the US, which would affect Trump...and therefore will never happen.

A Trump presidency would basically be characterized by substantial benefits accruing to the wealthy while his middle class and lower class supporters will be further subjugated. In other words, just write about the effects of his policy suggestions...which does not take much effort.

And perhaps the biggest one: instead of writing articles claiming that Trump will somehow magically take over every American institution, report that the reality is much more mundane. US institutions are strong and his ability to get anything done will be mired in the same red tape that every President faces. Thus, campaign promises to build walls, undo numerous trade deals, ban Muslims, deport Hispanics etc. are empty promises that simply will not be kept.

I wonder if Trump fans don't read things on the Internet as much.

tl;dr fewer hits per Trump story, but more stories

A copywriting course taught me that many print and TV stories are based on prepared materials sent in by whoever wants coverage. [It's not exactly "paid coverage", but "lower cost coverage".] Since then, I've noticed many stories that seem obviously of this kind, because of the particular perspective and beneficiary.

Trump isn't preparing releases, but makes comments that are easy -- i.e. low cost -- to cover.

His hits per article are better described as similar to Clinton, Sanders and Cruz.

The article says fewer than Clinton, his main competitor, and that is the gist of the article.

My point is that the difference between 9900 and 10200 is not particularly substantial. It's better summarized as "similar" than "fewer".

The differences with the other 2 are larger, but still not yuuge.
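As a quick check on "similar" versus "fewer", the two figures quoted above can be compared directly. Assigning 9900 to Trump and 10200 to Clinton is my reading of this sub-thread. The 2.5x article ratio is the top comment's reading of the charts, not a number I have verified against the data.

```python
# Per-article page views as quoted in this sub-thread (assignment assumed).
trump_views_per_article = 9_900
clinton_views_per_article = 10_200

# Relative gap in views per article.
per_article_gap = (
    clinton_views_per_article - trump_views_per_article
) / clinton_views_per_article

# Rough total-views ratio if Trump had 2.5x as many articles
# (ratio taken from the top comment, treated here as an assumption).
article_ratio = 2.5
total_ratio = article_ratio * trump_views_per_article / clinton_views_per_article

print(f"per-article gap: {per_article_gap:.1%}")
print(f"total views, Trump vs Clinton: {total_ratio:.2f}x")
```

So the per-article gap is about 3%, while the gap in total views under that article ratio is well over 2x, which is why the two sides of this thread can read the same charts so differently.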

Whether you agree with the premise one way or the other, it seems like the data does not say what they think it says with as much certainty as they think. (I also think that solely blaming the media is problematic.)

Pageviews per article is not without flaws: if the media is writing more articles on Trump, then it will decrease. If a site has 10 articles on Trump, and 1 or 2 on each of the other candidates each day, and Trump's average views per article is similar, then it's hard to conclude that there is a lack of interest.

Their graph shows Trump's average on par with the others, but he occupies 50% of the pie. That's huge. How is that categorized as "not driving traffic"?

It seems one assumption is sites could just write more articles about other candidates with the same engagement. If Hillary Clinton gets 6% more average clicks per article, it's hard to just pump out another article on Hillary if nothing new or novel has been said or done by her. Trump creates headlines and drums up controversy whenever he can in a way that most other candidates can't, because he's good at that. It's almost a symbiotic relationship with Trump and coverage of him.

Their definition of "driving revenue" rests on the average number of clicks a Trump article receives. If your competitor has articles about Trump, and you don't, could you get away with that? Clearly people are reading articles about Trump here, so those views could disappear to other sites. (There is also the assumption that it's important that people click into the article, but many people will skim the headlines of a website.)

It also just compares coverage to other candidates, but the content driven revenue of a site is not just a basket of political candidates. It is relative to all other content as well. How much interest does Trump attract across the site?

Their chosen time period is November 2015 to May 2016. This ignores any jump start Trump might have received while his campaign was still nascent, relative to the many other contenders. The media coverage started way before November. You might say that by November Trump was already polling high enough that it was hard to ignore him. He announced his candidacy in June 2015, and he has been consistently and heavily covered since. And Trump was already a celebrity and had been for years.

And this is why I was motivated to respond: "Many of the media companies we work with at Parse.ly encourage a data-driven culture that makes it easy for their employees to make informed content decisions. For example, anyone using Parse.ly can perform an analysis similar to the one we shared above."

I'm hoping "data-driven culture" in business doesn't become synonymous with arbitrarily using data to make a point or disingenuously undermine an argument. If there isn't discipline in application, then "data-driven" approaches will eventually gain a reputation as useless.

Not to be too cute, but I wonder how much traffic this post about Trump will drive to parsely.com relative to other blog posts.
