So if the value of a page view is worth about the same for each candidate, Trump is driving more revenue, right?
Long Answer: In the article we suggest that if publishers would have written more articles about Clinton, they would have received more page views, because in the data we observe posts on Clinton receive more page views on average. Similarly, we suggest that writing more articles on Bernie Sanders would have caused an increase in referrals from social and search. As with all non-experimental approaches to causal inference, valid conclusions require strong assumptions. In the case of this analysis, we assume that the average number of page views that articles on a candidate receive is independent of the number of articles written on that candidate. If it were the case that writing more articles on Clinton, and fewer on Trump, would have caused Clinton articles to receive fewer views, and Trump articles to receive more, then our conclusions might be wrong.
Trump is the head, and the other candidates are the long tail.
You said: "It's true that Trump got more pageviews overall, but that seems to be mostly because way more articles were written about him in the first place."
And: "we suggest that if publishers would have written more articles about Clinton, they would have received more page views, because in the data we observe posts on Clinton receive more page views on average"
This seems to be the wrong conclusion because of diminishing returns. Writing more articles about Clinton should still push down the average page views. There is only so much interest, and only so much new to write about every day. None of the other candidates can create fresh new controversies to feed the media the way Trump can. The question is how much would that push it down? Given that these sites operate in a market, I don't think it would be inaccurate to suggest that it would push Clinton's average significantly below Trump's.
I don't believe a base that strong exists per article, where any article is guaranteed to get some absolute number of page views. If diminishing returns aren't present or are extremely weak, then I'm wrong.
If anything, the data doesn't rule out that sites/reporters are correctly maximizing Trump coverage. Or they may not be maximizing enough since absolute demand is so high and Trump generates so much fresh content. If you can write about one easy topic, and maintain an average that high with only a small decrease in the average, you are doing more with less.
I agree that if you reject this assumption, and instead assume that there are 'diminishing returns', then the conclusion I arrived at could be wrong.
There probably is some kind of diminishing return effect, but we don't know how strong it is. It could be weak compared to the effect that 'readers will consume whatever journalists write'. It's pretty interesting that all of the last four leading candidates (Trump, Cruz, Clinton, and Sanders) had roughly comparable numbers for pageviews per post. That's evidence that readers just pretty much read whatever is published (with the exception of Kasich, a long shot).
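To make the disagreement concrete, here is a minimal Python sketch of how the strength of diminishing returns changes what the same observed averages imply. Everything in it is an assumption for illustration: the decay form `base / (1 + k * i)`, the parameter `k`, the function names, and the example numbers are hypothetical and are not taken from the Parse.ly data.

```python
def implied_marginal(observed_mean, n_articles, k):
    """Views expected from one more article, given an observed mean.

    Assumed toy model: the i-th article (0-indexed) on a candidate draws
    base / (1 + k * i) views. k = 0 means no diminishing returns, which
    is the independence assumption; larger k means faster saturation.
    """
    # Sum of the decay factors over the articles already written.
    decay_sum = sum(1.0 / (1.0 + k * i) for i in range(n_articles))
    # Back out the base that reproduces the observed mean.
    base = observed_mean * n_articles / decay_sum
    # Apply the decay factor for the next article.
    return base / (1.0 + k * n_articles)


# Hypothetical inputs: 10 Trump articles averaging 1000 views,
# 2 Clinton articles averaging 1060 views.
no_decay_trump = implied_marginal(1000, 10, 0.0)
no_decay_clinton = implied_marginal(1060, 2, 0.0)
# Clinton saturating quickly (k = 1.0) while Trump does not (k = 0.0).
fast_decay_clinton = implied_marginal(1060, 2, 1.0)
```

With `k = 0` the next article is worth exactly the observed average, so Clinton comes out ahead. If Clinton coverage saturates much faster than Trump coverage, the ordering flips. The observed averages alone cannot tell these cases apart, which is why the strength of the effect matters.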
It's also true that if you're a journalist right now, faced with the current distribution of articles, you're likely to get more page views by writing your next article on Clinton. This claim doesn't rest on any strong assumptions. That could change if many more articles on Clinton are written, but it's true for now.
If you look at the data in the dashboard, it's also interesting to see that Bernie Sanders gets way more social and search referrals compared to Clinton and Trump.
And saying that readers will consume whatever journalists write helps power the narrative that the media fueled Trump's campaign. They could write about a different candidate and get slightly improved pageviews, but they're choosing to flood with Trump articles. Your data would only conclude that pageviews aren't driving it.
I think expanding this to "readers will consume whatever journalists write" is a different argument and you would need to establish your "experiment" with a different methodology than the approach used here. The causation seems to be "reporters write news, it exists to be consumed on a site" therefore "readers read it" and that feels like it's missing something to me.
Also, it could be interesting they have comparable numbers per post, but it also backs up the idea that articles exist in response to the demand-supply feedback loop. If sites respond to pageviews, then candidates with lower average pageviews will simply not get as much media attention.
(1) Given equally interesting ideas for articles to write on each candidate, which candidate should a journalist write on?
(2) How much investment is required to write an interesting article on each candidate? It might require less work to write something interesting on Trump than on Clinton.
The right way of answering question (1) is with a dynamic multi-armed bandit algorithm. Such an algorithm dynamically explores the problem of diminishing returns. At this point, given the data we have, such an algorithm would suggest you should write on Clinton the vast majority of the time if you're interested in page views per article, and would suggest you write about Sanders if you're interested in bringing in external referrals from Facebook and Google. If journalists followed the advice of such an algorithm and wrote so many articles on Clinton that readers started to lose interest, then the algorithm would begin to suggest you write on someone else. If there's enough interest in this article, I might write up a follow-up where I fit a model that tells journalists what topic to write on, given that it's just as easy to write an article on each topic. I could update this model every once in a while to make sure it detects those diminishing returns in time.
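As a rough illustration of what a "dynamic" bandit could look like, here is a minimal Python sketch of an epsilon-greedy bandit with exponentially discounted reward estimates. It is not the model from the article. The class name, the `epsilon` and `discount` parameters, and the choice of page views as the reward are all assumptions made for illustration.

```python
import random


class DiscountedEpsilonGreedy:
    """Epsilon-greedy bandit whose reward estimates forget old data."""

    def __init__(self, arms, epsilon=0.1, discount=0.95, seed=0):
        self.arms = list(arms)      # e.g. candidate names
        self.epsilon = epsilon      # probability of exploring at random
        self.discount = discount    # per-update decay applied to history
        self.rng = random.Random(seed)
        # Discounted reward totals and discounted observation counts.
        self.reward_sum = {arm: 0.0 for arm in self.arms}
        self.weight = {arm: 0.0 for arm in self.arms}

    def estimate(self, arm):
        """Discounted mean reward; unseen arms rank first so they get tried."""
        if self.weight[arm] == 0.0:
            return float("inf")
        return self.reward_sum[arm] / self.weight[arm]

    def choose(self):
        """Pick the arm to write about next."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.arms)
        return max(self.arms, key=self.estimate)

    def update(self, arm, reward):
        """Record the page views (reward) earned by an article on `arm`."""
        # Decay all history first so recent articles count for more.
        for a in self.arms:
            self.reward_sum[a] *= self.discount
            self.weight[a] *= self.discount
        self.reward_sum[arm] += reward
        self.weight[arm] += 1.0
```

In use, you would call `choose()` to pick a topic and `update()` with the page views each published article earned. Because old observations decay, a candidate whose recent articles draw fewer views loses the lead and the bandit moves on to someone else, which is the diminishing-returns behavior described above.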
Question (2) is more difficult to answer and requires more domain knowledge. I would say it is possible at any moment to write hundreds of interesting articles on each candidate---the real question is how much work it takes. As I mention in the blog post, I am convinced that journalists find it easier to write interesting articles about Trump. So in some sense it's rational for them to do so: the 'return on investment' is higher because it's so cheap to churn out another article on Trump's latest soundbite. However, one could also argue that -- in the name of increased page views, or in the name of a functioning democracy -- they should make the extra effort to write an interesting article on the other candidates.
The fact is that in the midst of today’s 24-7 news cycle, most journalists can devote only a small amount of time to their next article, and so they often find themselves choosing topics that are convenient to write about. Imagine you’re a journalist in front of a blank screen, thinking about your next story, and faced with intense pressure to pump out content. There may be no clear breaking news on Clinton, Sanders, Cruz, or Kasich — so writing about these candidates may require you to conduct research or reach out to voters.
I really think this is what it comes down to; this and the "access" issue, where reporters are scared of losing direct access to spokespeople and candidates who provide easy and ready-made stories and quotes. Without that, reporters are forced to, you know, report, which they apparently don't have time for anymore.
In addition, Trump has significantly pushed up the total number of pageviews going to election cycle articles, even if the articles specifically about him are not as popular as the articles about Sanders, Clinton, Cruz, etc.
The media is reporting on itself and its 'effects' without even taking the time to look at the data that is available through sites like Parse.ly and Google Analytics to see if their theories hold any water.
The media has paid plenty of attention to him in the past.
You are describing candidates for office in general. Sanders may be an outlier, but he is the only one.
Far from Trump taking over the GOP, the GOP has taken over Trump and now literally dictates his policy.
Bankers contributed money and bought Trump's support for financial deregulation.
Trump's energy and coal policies are literally being written by GOP governors. The GOP now coordinates with Trump's speechwriter to determine what is said.
Starting a trade war with China would have a direct negative impact on real estate prices in the US, which would affect Trump...and therefore will never happen.
A Trump presidency would basically be characterized by substantial benefits accruing to the wealthy while his middle class and lower class supporters will be further subjugated. In other words, just write about the effects of his policy suggestions...which does not take much effort.
And perhaps the biggest one: instead of writing articles claiming that Trump will somehow magically take over every American institution, the reality is much more mundane. US institutions are strong and his ability to get anything done will be mired in the same red tape that every President faces. Thus, campaign promises to build walls, undo numerous trade deals, ban Muslims, deport Hispanics etc. are empty promises that simply will not occur.
A copywriting course taught me that many print and TV stories are based on prepared materials sent in by whoever wants coverage. [ It's not exactly "paid coverage", but "lower cost coverage" ] Since then, I've noticed many stories that seem obviously of this kind, because of the particular perspective and beneficiary.
Trump isn't preparing releases, but makes comments that are easy -- i.e. low cost -- to cover.
The differences with the other 2 are larger, but still not yuuge.
Pageviews per article is not without flaws: if the media is writing more articles on Trump, then his average will decrease. If a site has 10 articles on Trump, and 1 or 2 on the other candidates each day, and Trump's average views per article is similar, then it's hard to conclude that there is a lack of interest.
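The arithmetic behind that point can be sketched in a few lines of Python. The view counts below are hypothetical and chosen only to mirror the "10 articles versus 1 or 2" scenario; they are not taken from the Parse.ly data, and the function names are made up for this example.

```python
def summarize(views_by_candidate):
    """Map each candidate to (article count, total views, mean views)."""
    stats = {}
    for candidate, views in views_by_candidate.items():
        total = sum(views)
        stats[candidate] = (len(views), total, total / len(views))
    return stats


def traffic_share(stats, candidate):
    """Fraction of all candidate page views that went to one candidate."""
    grand_total = sum(total for _, total, _ in stats.values())
    return stats[candidate][1] / grand_total


# Hypothetical day on one site: 10 Trump articles at 1000 views each,
# 2 Clinton articles at 1060 views each (a 6% higher average).
hypothetical_views = {
    "Trump": [1000] * 10,
    "Clinton": [1060] * 2,
}
stats = summarize(hypothetical_views)
trump_share = traffic_share(stats, "Trump")
```

Here Clinton wins on views per article while Trump accounts for more than 80% of the total views, so the two metrics answer different questions.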
Their graph shows Trump's average on par with the others, but he occupies 50% of the pie. That's huge. How is that categorized as "not driving traffic"?
It seems one assumption is that sites could just write more articles about other candidates with the same engagement. If Hillary Clinton gets 6% more average clicks per article, it's hard to just pump out another article on Hillary if nothing new or novel has been said or done by her. Trump creates headlines and drums up controversy whenever he can in a way that most other candidates can't, because he's good at that. It's almost a symbiotic relationship between Trump and coverage of him.
Their definition of "driving revenue" is based on the average clicks to read a Trump article. If your competitor has articles about Trump, and you don't, could you get away with that? Clearly people are reading articles about Trump here, so those views could disappear to other sites. (There is also the assumption that it's important that people click into the article, but many people will skim the headlines of a website.)
It also just compares coverage to other candidates, but the content driven revenue of a site is not just a basket of political candidates. It is relative to all other content as well. How much interest does Trump attract across the site?
Their chosen time period is November 2015 to May 2016. This ignores any jump start Trump might have received while his campaign was still nascent, relative to the many other contenders. The media coverage started way before November. You might say by November that Trump already was polling high enough that it was hard to ignore him. He announced his candidacy in June 2015, and Trump has been consistently and heavily covered since. And Trump was already a celebrity and had been for years.
And this is why I was motivated to respond: "Many of the media companies we work with at Parse.ly encourage a data-driven culture that makes it easy for their employees to make informed content decisions. For example, anyone using Parse.ly can perform an analysis similar to the one we shared above."
I'm hoping "data-driven culture" in business doesn't become synonymous with arbitrarily using data to make a point or disingenuously undermine an argument. If there isn't discipline in application, then "data-driven" approaches will eventually gain a reputation as useless.
Not to be too cute, but I wonder how much traffic this post about Trump will drive to parsely.com relative to other blog posts.