Does Tweeting Improve Citations? One-Year Results from a Randomized Trial (annalsthoracicsurgery.org)
96 points by XzetaU8 on June 14, 2020 | 80 comments

The research community really should do a better job of engaging with the wider world, irrespective of whether it increases citations. For example, I had this paper in Hotnets 2019, and about 80 people saw my presentation. But I also spent some time, turned it into a little video and put it on YouTube. 360,000 people have seen it there. Now the subtleties were probably lost on many of those people, but if only a few of those people got something out of it, then it will likely have had more impact than the original paper.


I think the general trend of researchers becoming more media-savvy is great. But I also get the sense that researchers are increasingly choosing to work on Twitter-friendly topics (e.g., sensational, visually striking, reducible to sound bites) — whether that’s also positive I’m not so sure.

I wonder if there's an inverse relationship between the level of pressure on scientists to publish, tweet, blog and demonstrate the social relevance of their work, and groundbreaking advances in fundamental physics.

I don't think so. The culture of publish or perish is certainly detestable and I don't think 160 characters are enough for a synopsis. But I don't think it negatively affects research. At least I hope so.

Fundamental research and humanities both have these funding issues. It may be worth a discussion if these fields could benefit from increased public funding, because some forms of research don't result in an immediate economic benefit.

I'd be really surprised if it didn't have at least some negative impact, not so much because of the amount of funding but because of what you have to do these days to get the funding. The more time and energy you force academics to spend on journal/conference submissions, grant proposals, impact assessments, and public outreach, the less they have to spend on research. I'd expect it to be especially true for more abstract subjects as they're harder to distill into a concise form or to describe the social impact of, and often require protracted periods of intense study/thought which admin responsibilities distract from.

Add to that the number of distractions that everyone is subjected to in the modern world, and I start to feel like it's no wonder we're still struggling with quantum interpretations or going beyond the standard model.

Now, I'm not saying that we should reduce the number of conferences or papers, or amount of outreach. Just the amount of time academics need to spend on them. Of course there will always be a few of the elite who will find time to sit and think for hours per day outside of these, nevertheless the harder you make it the more knowledge will suffer.

There was an opinion piece on PNAS a few years ago that argued just that, no hard evidence though:


Thanks, that was a good read. Perelman is actually one of the examples I had at the back of my mind.

This is pretty much guaranteed to happen as explained by “the medium is the message”: https://en.m.wikipedia.org/wiki/The_medium_is_the_message

This reminds me of a story from Steven Levitt.

He published Freakonomics together with Stephen Dubner then went back into academia while Dubner went down the podcaster route.

15 years later, Dubner has reached millions of people with his work and Levitt recently started a podcast and soft quit academia after realizing that one of his best papers after years of research and hard work got 3 citations. He assumes fewer than 20 people will ever read that work.

I'm not sure why number of views/reads is the metric to optimize for, though. I'm sure more people listen to the Joe Rogan Experience than have read a Mathematical Theory of Communication. Does that intrinsically make it worth more?

Depends on your goals. Do you want to have impact or chase obscure brain teasers?

I gotta say A Mathematical Theory of Communication is not an obscure brain teaser, and has had quite an impact.... so your reply doesn't really reflect what the previous poster was asking about...

Size of impact is orthogonal to quality of impact, people who chase social stats seem to forget this fact.

Who said anything about social stats?

Levitt in particular says he feels like his research can have a bigger impact if it’s shared broadly with people than if it reaches 10 other academics and that’s it. We’re not talking about likes on social media, we’re talking about breadth of audience for academic research and how scientists can do better than chasing meaningless citation stats.

Levitt talks about his thought process in this wonderful episode of Freakonomics. https://freakonomics.com/podcast/math-curriculum/

> Who said anything about social stats?

You did.

> Dubner went down the podcaster route. 15 years later, Dubner has reached millions of people [...]

I assume this is measured in terms of subscribers or listen counts.

I'm not saying the Freakonomics Stevens have/haven't had quality contributions to the world, I'm not even saying anything w.r.t. average podcast quality vs. average academic paper quality. I'm only saying that people often measure impact along only one dimension.

Levitt is leaving academia? Wow - that’s news!

Might be more of a sabbatical.

This is the Freakonomics episode where he talked about this https://freakonomics.com/podcast/math-curriculum/

Interesting. I will listen. He’s a distinguished service professor which is as high as you can get.

This sounds like an empirical side project rather than leaving academia, no? Has it been long enough to see the results?

Interesting that they put their hopes on data fluency on the College Board while the SATs are falling out of fashion.

Most research is not useful or suitable to the wider world.

Well, there are gradations to "the wider world". I don't know about you, but I read a lot of stuff that's way outside my area. That's why I hang out here. There's a huge gulf between "suitable for mainstream TV" and the opaque and overly compacted research paper. Places like New Scientist and Ars Technica fill this gap to some extent, but they tend to oversimplify to appeal to a wider audience. I'd rather hear it from the authors in an only slightly dumbed down form.

Same feeling here. That's why a friend of mine and I created Abstra. We ask authors to come and do some popularization themselves. The summaries are structured exactly like most papers: background, findings, and methods. Quick and to the point. There is an incentive for both sides, as the reader gets quality info from the source and the researcher gains audience and exposure. We're at the prototype stage. Check it out at abstra.co.uk; we would love to have your feedback!

Even as an ambitious and educated outsider, the stuff you're looking at is almost certainly not representative of most research in those respective fields. When weighted by pages, most research is boring and truly only of interest to the few dozen specialized researchers in that subfield.

I spend a lot of time reading research papers in various fields that the original authors were not directing toward me as an imagined audience, and get plenty of value from papers that could be called “unsuitable to the wider world”.

The ability to use the web, citation indices, etc. to discover and read papers has made the whole of the scholarly literature much more useful to me than it would have been a few decades ago.

I guarantee a cottage industry of YouTubers would pop up digesting and rewrapping that content. Today, it's probably too inaccessible for them to do so.

Two Minute Papers is a great example of this:


Two minute papers is fantastic! I just wish that all of the papers he covers had source code available. A few of them I've gotten excited about and hopped around the net to read the paper only to find no code available.

This is the state of science in a lot of fields, unfortunately. Many reviewers skim the paper, put in a thinly veiled complaint about why their own paper was not cited, and don't bother with the code.

The code itself is usually a mess: no readme, no license, random binaries, and a 'pipeline' which is simply a bunch of Perl and Python scripts cobbled together from snippets found on the web.

People publish, then move onto the next project. There is no maintenance.

In my case, I'm not a researcher or even a computer scientist (EE). I'm just a person with a curiosity about various topics. I find the language in papers to be a big barrier. Even kludgy bad code in ten different languages is more readable to me than the strange symbols and vague diagrams.

I agree with you that this is research and there's no reason for it to be pretty. The fact that it's unsightly and brittle makes a lot of people reticent about sharing it online since it could reflect on the quality of their non prototype code. Maintenance is a large burden and once the original goal of the paper has been proved, there's little benefit to the researcher to keep the code around, so I understand why they might say let's not release it at all.

A recent example: I was building a gesture classifier using an accelerometer to generate data. I decided to use an SVM because, based on what I had read, it would be the simplest to port to the microcontroller I was using. I was getting decent performance but needed to extract better features. I found several papers on the topic with some novel ideas and impressive results. While it was nice to know that accuracy could get that high, none of the papers explained how they extracted their features. It was usually a one- or two-sentence line about using both time and frequency domain data.
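
For readers wondering what "time and frequency domain" features might look like in practice, here is a minimal sketch in plain Python. It is not the method of any of the papers mentioned above: the feature set (mean, standard deviation, RMS, dominant frequency), the function names, and the 100 Hz sample rate are all assumptions chosen for illustration.

```python
import cmath
import math


def time_features(window):
    """Simple time-domain statistics for one window of accelerometer samples."""
    n = len(window)
    mean = sum(window) / n
    variance = sum((v - mean) ** 2 for v in window) / n
    rms = math.sqrt(sum(v * v for v in window) / n)
    return {"mean": mean, "std": math.sqrt(variance), "rms": rms}


def dominant_frequency(window, sample_rate_hz):
    """Frequency (Hz) of the strongest non-DC bin, using a naive O(n^2) DFT."""
    n = len(window)
    mean = sum(window) / n
    best_bin, best_magnitude = 0, 0.0
    for k in range(1, n // 2 + 1):
        coeff = sum(
            (v - mean) * cmath.exp(-2j * math.pi * k * i / n)
            for i, v in enumerate(window)
        )
        if abs(coeff) > best_magnitude:
            best_bin, best_magnitude = k, abs(coeff)
    return best_bin * sample_rate_hz / n


def extract_features(window, sample_rate_hz):
    """Combine time- and frequency-domain features into one flat dict."""
    features = time_features(window)
    features["dominant_hz"] = dominant_frequency(window, sample_rate_hz)
    return features


# Hypothetical input: one second of a 5 Hz wobble sampled at 100 Hz.
SAMPLE_RATE_HZ = 100  # assumed, not from any of the papers
window = [math.sin(2 * math.pi * 5 * i / SAMPLE_RATE_HZ) for i in range(100)]
print(extract_features(window, SAMPLE_RATE_HZ))
```

A feature vector like this (one per axis, per window) is the kind of input an SVM could be trained on. On a real microcontroller the naive DFT would presumably be swapped for an FFT or a handful of Goertzel filters.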

> This is the state of science in a lot fields unfortunately. Many reviewers skim the paper, put in a thinly veiled complaint about why their own paper was not cited, and don't bother with the code.

Excuse me. Is this your actual experience or the meme that everybody is simply repeating?

In the admittedly few papers I have published, the reviewers’ feedback has been a net positive I wouldn’t be without.

Also, the general focus on code in science here on HN is so incredibly myopic and narrow minded. Code for research are not high-profile software projects. They are a piece in the toolbox used to solve a problem. Some grow to become larger projects, some fulfill their purpose and stay fixed in time.

Actual experience. The state of code in Bioinformatics is a joke. Many do not know the word 'conda' or 'docker', and example datasets are far and few between.

I sincerely doubt it. There is much more that is published than the general public has an appetite for. Even in a narrow scope, people have a limited appetite for any particular form of content before they're satiated. If too much content is produced for any particular topic, people will burn out and tune out.

Would you say the wider world is interested in watching a video on "Using ground relays for low-latency communication in Starlink"?

If Tom Scott or a similar personality did a video on it, yes.

If you present your research such as even a 5-year-old kid understands it then it will be suitable for the wider world.

That video is brilliant, by the way. I just watched it all the way through; it makes sense that a video with such a high learning bandwidth (for lack of a better term) gets so many views.

What's interesting is that I didn't make that video for a general audience. I made it for the networking research community, so I didn't bother to explain what Dijkstra's algorithm is, or any of the basics really. There are also a number of graphs which conventional wisdom says should not be in a video for a general audience. Turns out we often underestimate the large niche technical-but-not-specialist audience which doesn't mind things not being dumbed down too much.

It's a matter of incentives, having made such a video you no doubt have a good understanding of the time required to produce this kind of content. Even publishing clean, reproducible code, which should be the absolute minimum, takes a lot of time and is absolutely not rewarded. So when researchers have a choice between publishing two papers with no supplementary material or publishing a single one that is well documented, cleanly coded, with a blog post, a video explanation, etc... it's no question. With the current incentive structures the choice is made for you.

That's a noble practice, but I think everyone should stick to what they do best, for better overall efficiency. As long as you make your paper easily available, as another commenter said, another person can make it more comprehensible for the general public.

One of the causes of the slowdown (even if marginal) of science and research in general is the administrative and social media overlays that were added to them.

A typical computer systems research project takes two years, though many take a lot more. After several years of research, we write a paper like it's 1920, spend a day knocking together some PowerPoint slides, and give a 20-minute presentation at some conference (and those are the successful projects). Science, whether pure or applied, is about pushing forward the bounds of knowledge and then, crucially, communicating those insights. There are so many papers published these days - one of my papers has been cited over 10,000 times - that's an average of 1.5 times a day for 19 years straight. Seems unlikely most of those papers ever get read. If you still believe in the work once it's published, you really should do your best to communicate your excitement to others. Unless you made a huge breakthrough (unlikely in any field), no-one else is going to learn from your insights otherwise.
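
As a quick sanity check on the citation-rate figure above, here is a back-of-the-envelope calculation. It assumes exactly 10,000 citations and 365-day years, and the variable names are only illustrative:

```python
# Rough check of "10,000 citations over 19 years" as a daily rate.
citations = 10_000       # taken as exactly 10,000 for the estimate
years = 19
days = years * 365       # ignoring leap days

per_day = citations / days
print(f"{per_day:.2f} citations per day")
```

This works out to about 1.44 a day, so "1.5 times a day" is a fair rounding.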

> As long as you make your paper easily available, as another commenter said, another person can make it more comprehensible for the general public.

I'm not sure you appreciate how hard this might be for someone who is not closely connected to a project.

The people who can do that reliably will have (a) academic chops on a level similar to a peer reviewer's and (b) social media skills. Requirement (a) puts them back in the bracket of people you don't want to 'slow down'.

Well I agree but I see it like a pyramid. You don't have to worry about people at the bottom of the knowledge pyramid getting access to the last breakthrough in some field if they don't already know the basics about that field. They will naturally access it when they "climb" this pyramid of knowledge through more specialized sources or studies.

Moreover: the population of people who are good at doing something isn't necessarily the same as the one good at explaining it.

It's like saying Newton would have to spend 20% of his time explaining to 14-year-old kids what he's doing, when someone less capable than him could do it just as well and leave Newton free for what he does best.

Same with MOOCs I think. It's too bad most of them are so bad they are not worth taking, but the ones that are good (e.g. Andrew Ng's machine learning MOOC) are having insane impact even though most people don't finish them.

The statistical analysis in this paper is really poor. Without the raw data I would say it is impossible to tell anything from this study.

> change in citations at 1 year (Tweeted +3.1±2.4 vs. Non-Tweeted +0.7±1.3, p<0.001)

I don't know what that p<0.001 means in this context, but there is certainly more than a 0.1% chance that the null hypothesis (the two distributions are the same) is true given those 95% confidence intervals.

The graphs that are used to illustrate the paper have completely different confidence intervals than the numbers in the paper and look too good to be true.

Given the average tweet was engaged with less than 16 times I really doubt that the effect size would be this big. This looks to me to be the case that a couple of good papers happened to be tweeted and that nothing statistically significant can be gleaned from this paper.

Having said that I am not a professional statistician so I would appreciate the input of someone more knowledgeable than myself.

Yeah, I agree this looks like one of those times where the researchers aren't interested in doing the stats at all, so they just mail it in.

Just ignoring the stats though the results seem pretty solid. 3x the number of citations with ~100 observations. If the tweets were truly randomly assigned and there aren't big outliers driving results (big ifs, not gonna go deep enough to find out), their conclusion should be fine.

Perhaps ±2.4 is the estimated population standard deviation of the tweeted group, rather than a confidence interval. The confidence interval could be the smaller value you see in the charts.

Two distributions can overlap a lot and still have a low p-value for anova etc if there are enough data points.
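
To put rough numbers on the two points above, here is a sketch of a Welch-style test computed from the summary statistics quoted in the thread. It rests on several assumptions that the abstract does not confirm: that the ± values are standard deviations, that each arm has 56 articles (4 tweeted per day for 14 days, with an equally sized control arm), and that a normal approximation to the t distribution is good enough at this sample size.

```python
import math


def welch_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Welch's t statistic from summary statistics (unequal variances allowed)."""
    standard_error = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return (mean_a - mean_b) / standard_error


def two_sided_p_normal(t):
    """Two-sided p-value using a normal approximation to the t distribution."""
    return math.erfc(abs(t) / math.sqrt(2))


# Assumed inputs: +/- values read as standard deviations, 56 articles per arm.
N_PER_ARM = 56
t = welch_t(3.1, 2.4, N_PER_ARM, 0.7, 1.3, N_PER_ARM)
p = two_sided_p_normal(t)
print(f"t = {t:.2f}, p = {p:.2e}")
```

Under those assumptions t comes out at roughly 6.6, which puts p far below 0.001 even though the two distributions overlap heavily. If the ± values were instead 95% confidence intervals, the picture would be different, so this only shows that the reported p-value is plausible under the standard-deviation reading.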

This is actually very scary to me. This means that many papers' citation lists are not the result of a thorough literature search but rather of the authors recalling papers they might have read or seen on social media. This is dangerous in that if there were papers from before the tweet era that actually found that part of your research protocol was bad, they would not be found.

An example of this is the rat maze study that Feynman cites.


The last thing we need is clickbait science.

Unlike in the past, it’s impossible to read every paper in one’s field. Since tools like PubMed, ISI, and Scopus all use a proprietary algorithm to rank search results, I appreciate any alternative channels for bringing relevant papers to my attention.

It’s also perfectly justified to search for papers solely based on how influential they’ve been in your field. Papers which trend on Twitter (perhaps long-awaited results) are already having an impact on labs, probably being discussed in journal club and shaping the trajectory of the field.

PubMed is an interesting case, I wonder if the algorithm it uses for ranking would be FOIAable

> This means that many papers' citation lists are not the result of a thorough literature search

Related work sections aren't typically thorough, that's what lit surveys/reviews are for. It's also not a new phenomenon that highly cited work gets cited because it's highly cited, i.e. work is cited simply because it's more readily noticed. It's furthermore common for the same research to arise independently in different fields, with different terminology, i.e. bodies of research are somehow not found. So, no need to be scared, it's nothing new.

Unfortunately, I think we're already at a clickbait science level to some extent in some fields, depending on what you mean by "clickbait".

This effect would have to be replicated in some form to understand its boundaries better but if it were to generalize I think it has much broader implications than just Twitter and thoracic surgery.

There have been lots of bibliometric studies of citation impact etc. but this is one of the first times I've seen an actual experimental study in academics of this sort of thing, and it confirms what a lot of people have been experiencing, which is that actual scientific quality is only part of the reason why some work gets lots of attention. Replace twitter with other forms of social networking and the implications become clear.

There are ups and downs to this: maybe some underappreciated work would get more attention if it's marketed more, for example. But it speaks to the role of things outside of the domain of study per se (not sure what the term for this would be -- something like nondiegetic but for scientific content).

I think you're conflating several different purposes of citations.

A thorough literature search is mostly needed to avoid claiming as novel innovations/discoveries things that others have previously published (even if you more-or-less ended up discovering it independently). Secondarily, it also helps you avoid ratholes that others have stumbled into (although the reduced incentives for publishing negative results makes this less effective).

However, citations of papers encountered on — and recalled from — social media are more likely to be for the inspiration they provided for the line of inquiry or methodology in your paper (even if that inspiration was negative, ie. you're refuting someone's work or conclusions).

Some alternative explanations (likely a mix of them): papers that you see / think about influence the direction of your research; lit review is done early, but relevant papers you see in the process will be added; lit review is never going to be perfect, so more advertising = more chances of spotting something relevant.

Kill Elsevier and the paywall then. They are the alternative, but no one is going to pay $30 per paper they need to read. If references were hyperlinks to the full PDF, they would get more views.

I'm all for open access, but this experiment didn't change the accessibility of the papers.

I find it ironic that this is about citations and it costs $36 just to read. So if you wanted to cite this and somebody wanted to verify they would have to spend another $36.

Or they could use sci-hub

As an outsider, it's always fun to read the research paper drama.

Most authors put the preprint on their website for free. Almost every paper I try to read can be found on Google Scholar.

Except this one.

This is really frightening for anyone who works in these fields but isn't on any of the main social networks. I left facebook and co years ago and feel much better for it. I get that marketing matters, but marketing on a shitshow like twitter seems sanity challenging.

Twitter is what you make it to be. If you follow people that post a lot of (angry) political stuff and you get yourself involved and also tweet about it, then yes, things quickly become a shitshow. If, however, you look for people that talk about stuff you're interested in or from your own field, then you can get a lot of interesting conversations with people all over the world. Use the unfollow button and/or mute option for people that bring too much negativity to your feed and don't engage in negativity yourself.

The problem is that (last time I looked) you can't only look at people's tweets. You have to see people's retweets too. So most attempts to only see certain aspects of twitter are too hard or incredibly limiting.

With TweetDeck [1] you can select whether you want to include retweets or not [2], plus a lot of other useful options.

However, I think RTs enhance the experience a lot, as you get to discover other people with similar interests kind of organically. But the same rule applies to people who RT a lot of negativity-heavy tweets: just unfollow or mute them.

If you're interested, I recently wrote a bit about how I've been using Twitter quite happily in the past few years [3].

[1] https://tweetdeck.twitter.com/ [2] https://i.imgur.com/0GOMZvg.png [3] https://dev.my-gate.net/2020/05/30/eight-years-on-twitter/

Thanks for the pointers. You've convinced me to give it another try. Any idea whether there's a way to hide retweets from specific people while allowing retweets from others? Like for people who tweet content on topics I care about and retweet content on topics I want to stay away from? Like, instead of muting someone, just muting their retweets.

I've not seen such a feature, but there are also a lot of third-party apps, so maybe one offers such a feature.

Cool. Thanks again for the tips :)

I find this incredibly unsurprising. If you follow journalists on Twitter you'll hear the same thing over and over again. They say Twitter is a disaster: every time you say anything slightly controversial, dozens of random strangers attack you, and some will even permanently follow you, insulting you until you block them. They also say they can't leave because it's one of the primary drivers of traffic to their articles and it would be impossible to achieve the metrics their employer requires without it.

It seems a strange property that twitter has developed of being both a terrible experience but also incredibly effective at organic advertising.

Ugh, this is so right. A lot of the trolls are also just new accounts the IRL troll makes to keep harassing. And don't forget the bots and the hordes of paid trolls employed by either companies or foreign governments to foment divisiveness. It reminds me of the Simpsons episode where the teachers go on strike and Bart runs through the angry mob yelling inflammatory and accusatory tripe.

Tweeting improves citations

Perhaps a "relevant" paper "The Kardashian index: a measure of discrepant social media profile for scientists" (https://genomebiology.biomedcentral.com/articles/10.1186/s13...)

This is just a joke paper, there is no statistical analysis in it and the key findings are: some scientists the author knows have more followers than others.

I'm currently building an alternative to Twitter for publishing summaries of papers. It's called Abstra. We're hoping to give researchers a better format to share their stuff, and a quick option for that to avoid wasting time. Would love to have your feedback! Abstra.co.uk

I'd love to see the paper (but don't want to pay $36 for it). Citations are often power-law distributed - I'd suspect one outlier is driving this effect.
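
To illustrate the outlier concern, here is a toy example with made-up numbers. The citation counts below are invented purely for illustration and are not the study's data; the point is only that one heavily cited article can open a large gap in means while the medians stay identical.

```python
from statistics import mean, median

# Hypothetical one-year citation changes, invented for illustration only.
tweeted = [0, 1, 1, 0, 2, 1, 0, 1, 40]   # one runaway article
control = [0, 1, 1, 0, 2, 1, 0, 1, 1]


def summarize(values):
    """Mean, median, and mean after dropping the single largest value."""
    without_max = sorted(values)[:-1]
    return {
        "mean": mean(values),
        "median": median(values),
        "mean_without_max": mean(without_max),
    }


for name, group in (("tweeted", tweeted), ("control", control)):
    print(name, summarize(group))
```

If the raw per-article data were available, comparing medians (or re-running the comparison with the top article removed) would be one simple way to see whether the reported effect survives.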

Tl;Dr : yes. Immensely. Almost 9 times.

Great case study!!

Well, tweeting where?

If it is tweeted on an account with 0 followers, then certainly not, right?

And when tweeted to one million scientists in the same field, then certainly yes, right?

> 4 articles were prospectively tweeted per day by a designated TSSMN (Thoracic Surgery Social Media Network) delegate and retweeted by all other TSSMN delegates (n=11) with a combined followership of 52,893 individuals and @TSSMN for 14 days

Tweeting is science now.
