Reasons blog posts can be of higher scientific quality than journal articles (daniellakens.blogspot.com)
233 points by vixen99 on April 17, 2017 | 94 comments



I'm surprised that so many of the comments seem to attack the claims in the post along the dimension of "well, anyone can write a blog post." I think as a community HN would staunchly advocate that the recipe for building great companies or software is an almost radical devotion to reducing any and all barriers to entry, yet in this domain people somehow feel that the rigidity and exclusivity which journals provide lead to the betterment of science. Makes me think about pg's "what you can't say" post.

There is short shrift given to the distortion that gets introduced in order to "get published". The "file drawer" effect is very real, and it gives us a very skewed view of what the actual scientific work being done really is; p-hacking and other statistical trickery are table stakes to anyone trying to survive in the academic world, as being published is a key metric to success yet it is surprisingly orthogonal to being right. As Brian Nosek's "reproducibility project" is uncovering, there is a non-trivial amount of "science" out there that is pure garbage, and much of it was produced in order to "get published."

In a world where "random hackers" can write twitter bots to parse scientific papers and uncover obvious mathematical flaws (which sometimes invalidate central claims of the paper), I don't understand how we aren't immediately gravitating to standards which promote more and more openness.

Lastly, the biggest issue we have to address is the fact that notoriety is now the chief indicator of success, and not genuine scientific discovery. In college I had world-famous faculty who have made a career, a fortune, and placed themselves squarely in the spotlight on the backs of research which was later completely reversed -- and not in the "that's how science is supposed to work" sense, but rather in the "you purposefully fudged the math" sense. And yet there was virtually no reputational damage done to their careers. Worse yet, I think it is clear that this lack of ethics is a direct cause of the success they enjoy today. Given that these are pressures faced by all in the academic community, true openness is the one avenue we have to counteract this sort of thing, and hopefully enterprising individuals come along and find a way to make a "truth-seeking" reputation something researchers care about.


One of the reasons why I sometimes dislike my industry (design) is that it's filled with a lot of Medium posts that get highly upvoted by, say, emerging or other adept designers, yet it's rare that someone will substantiate their claim with data to prove that a rule or belief that they state is actually true. Sometimes I feel like designers believe x only because other designers believe x, when in reality non in-group behavior might suggest something completely different, that x is irrelevant, unnecessary, or confusing.

What I enjoy about learning about other fields that make their claims in research papers is that the work is at least legitimized through a rigorous process. One can substantiate that y is better than x because tests show doing y requires less computing time than doing x, or that doing y results in higher accuracy of the model. While getting a research paper published is an imperfect process where even high quality ideas get rejected for at times pretty trivial reasons, I can appreciate that the process itself mostly results in a sharpened idea.


> Sometimes I feel like designers believe x only because other designers believe x, when in reality non in-group behavior might suggest something completely different, that x is irrelevant, unnecessary, or confusing.

Parallax scrolling sites come to mind. They look pretty, but quite often are terribly dysfunctional to navigate and use for a customer who just wants to learn about or buy(!) your product.


Is _your_ assertion backed by data?

Here's a study that disagrees with you, at least in terms of how many people have trouble with parallax (2 out of 43). http://uxpajournal.org/the-effects-of-parallax-scrolling-on-...

Not to say that study is a comprehensive judgment on the design pattern.


TL;DR for anyone wandering by:

> We hypothesized that PS would improve UX, which is defined in this study as the emotions that are aroused when a user interacts with a product or technology. The PS website was perceived to be more fun than the no-PS website. With respect to perceived usability, enjoyment, satisfaction, and visual appeal, there were no differences between the PS website and the no-PS website.

Great that someone formally studied it, though I'm a little put off by their subjects' demographics as compared with the potential customers of any given business. That is, I feel like students at a university are more adept at using interfaces than "the average user" who is going to be your potential customer.

One of the five factors they tested did signal added value, that one being "fun". More fun, but not more usable. I'm not yet convinced that fun correlates with sales in this context as much as usability does, ESPECIALLY once the initial wow factor wears off.

I assume by "trouble" you mean motion sickness. Two of them had motion sickness, not just run-of-the-mill confusion from the usability of the site.

All that said, the main point I was making was that often designers spend a great deal of time trying to get PS working just right not because they think it will bring in more customers or make their site more usable, but because that's what other designers are showcasing and they want to emulate that trend.


That's because a lot of folks on HN come from Harvard or Stanford or MIT, where it's drilled into you that this pedigree is super important. There are companies that will not hire you for an engineering role if you don't have a CS degree from a short list of universities unless you're part of an acquisition (and even then).

This gets ignored the second they get $800k in VC money but for some reason the same idea never translates to "real science" or academia.


I support your message in spirit. But I have a concern.

The more open publishing becomes, almost by definition, the more dilution happens to the citation process. Want to know where this happened before? Trace back to Google's origins, where the "brilliance" of the "citation" technique made it so much better to begin with. But what happened afterwards? Was the interlinking just left as it was, pristine and still as valuable, just waiting for Google and other search engines to exploit it? Of course not. Any kind of system which can be gamed will be gamed. And what happened to the internet after Google became the sheriff? I would say it is now a fairly overcomplicated and twisted system (especially for the hobbyist) with the cat and mouse game going on between SEO companies and Google.

The distortions that are introduced today in order to get published, will be replaced by new distortions that will be introduced to get cited.


> In a world where "random hackers" can write twitter bots to parse scientific papers and uncover obvious mathematical flaws (which sometimes invalidate central claims of the paper), I don't understand how we aren't immediately gravitating to standards which promote more and more openness.

I understand your idea, but the thing is, nobody has tons of free time to read an unknown hacker's blog (if he/she even keeps one). People are busy with work, so the best option is to read a peer-reviewed paper. It's like a brand, I guess.


The same constraints you cite here apply to "peer reviewed" articles. The academics doing the peer reviewing are also "busy with work" and want to do their own thing, and especially in cases where reviewers are anonymous, the incentives facing the academics charged with the review are absolutely horrendous.

My point with the part about the bot that checks errors is that, if we are being honest, in far too many cases that "peer review" is less than ironclad, to put it mildly. In that scenario, the journals actually do us a disservice because they attempt to signal that the content you are reading went through a strict bit of scrutiny before it reached your eyeballs. That is a dangerous assumption to believe in if it turns out that it really wasn't.


I guess it depends on the size and culture of a given research community. Researchers are people too and petty things do happen. But, as a referee, I have followed through error propagation (once I saw error bars that looked amazing, wondered how I could get such beautiful results, and discovered they had made an error in the error propagation). When reviewing methods papers, I have checked the integrals and equations, looking for errors (including minus signs), because eventually it will go into some code that most people won't read. Then, there are just simple checks on the science and on the history of the field. I don't think that I'm unique. Refereeing well takes time, but it's part of the responsibility/duty of being a researcher. A friend's metric is that he should referee at least as many papers per year as he publishes.

By all means, when you read a paper, be skeptical and go through it yourself, but in my field, referees do actually provide some quality control.


This is as it should be. But the reader of the article has no idea whether you looked at it or someone with a far less significant commitment to the task than you. And what's keeping you and your friend doing this time-consuming work is your commitment to science and your sense of ethics and responsibility, which your community certainly does not universally and uniformly share. That commitment is also the first to go when career, money, lack of time, etc. start coming into the picture, or the pressure to abandon it is certainly high.

That's what we need to find a way to alter: If you are doing great science -- including the painstaking work of carefully reading a paper and following minus signs through calculation and checking integrals and whatnot -- that should be explicitly rewarded in the profession, and shouldn't just rely on intrinsic values a researcher holds in order to survive.

I don't see journals today being the leaders in driving that innovation.


You're inflating the flaws while diminishing the successes of the existing system. Pick a field and pick the top 3 journals. Now how many of the papers published in those journals were junk science? The fact that you have unscrupulous journals and editors doesn't go away with blogs. It's all the same. Personally, I don't care what the medium is, but I'm worried that organizations whose purpose is to highlight good work might die out. I think the problem is similar to selecting 3 resumes from a pile of 2000 resumes. If you look at the existing HR system, yeah, it sucks and good candidates are often passed over for bad candidates who get hired because they're ex-Google or w/e, but then again you can't just dump two thousand resumes on your dev manager and have them go through each one.

Also, before you propose tearing down a system that has produced results for decades, flawed as it is, you have to show that the replacement is actually useful and capable of at least performing on par. I have no problem with casual discussion on HN (in fact I actively avoid the insane people who want "proof" for everything), so I don't really care to hold your comment up to a gold standard of absolute certitude, but you should probably give your ideas a bit more thought.


> In a world where "random hackers" can write twitter bots to parse scientific papers and uncover obvious mathematical flaws

Do you have an example of this?


I think a StatCheck-like clone was built that would then go on twitter and try to find the author, but I readily admit I couldn't find it again. But I think StatCheck itself is a good example -- just a bot that can go through a paper and point out math mistakes.
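
For a flavor of what such a bot does under the hood, here's a rough sketch in Python (assumptions: this is not StatCheck's actual code, and the text excerpt is made up) of the kind of consistency check it runs -- pull reported t-tests out of a paper's text and recompute the p-values:

  import re
  from scipy import stats

  text = "The effect was significant, t(28) = 2.20, p = .01."  # hypothetical excerpt

  # Find patterns like "t(df) = value, p = value" and recompute the two-tailed p.
  for df, t, p in re.findall(r"t\((\d+)\)\s*=\s*([\d.]+),\s*p\s*=\s*(\.\d+)", text):
      recomputed = 2 * stats.t.sf(float(t), int(df))
      if abs(recomputed - float(p)) > 0.005:  # crude rounding tolerance
          print(f"Possible error: reported p = {p}, recomputed p = {recomputed:.3f}")

A real tool would need to be more careful about rounding and one- vs. two-tailed tests, but the core idea really is that small.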


One reason blogs are of lower scientific quality -- anyone can make a series of large and unresearched claims, cherry-picking a few examples to support a broad point.

Good journals do have peer review, and they wouldn't have accepted this.

Also, one big disadvantage of blogs -- depending on where you host it, they often don't last long, and point 4 (easy to edit) is a blessing and a curse, particularly if people edit things without telling you they did it.

Of course, blogs have their place, I keep one myself, it's great for breaking news, explaining things in greater depth, sharing and understanding. And journals could learn from blogs / arxiv (make it easier to make corrections, allow discussions).


Getting a research paper published sounds so meritorious, no matter how idiotic or trivial the paper is. Take, for example, the Princeton 'research' in 2014 that predicted Facebook would lose 80% of its users by 2017 [1]. The evidence? Google Trends data, which has near-zero correlation with Facebook's DAUs. The research not only got published, but also got a wider audience after major media websites covered it.

Had it been a blog post, I don't think anyone would have taken more than a cursory look at the analysis. Yes, anyone can fit data points and reach a conclusion with blogs, but isn't academic research fraught with similar problems already?

[1]: https://www.theguardian.com/technology/2014/jan/22/facebook-...


You picked a bad example. The actual source article is published at... arxiv. From an academic perspective it is worth exactly as much as a blog.

Academic literature has many problems but it is way above the blogosphere in that regard. And media coverage is going to suck no matter where stuff is published...


> From an academic perspective it is worth exactly as much as a blog.

From the perspective of peer review, yes. But per Google Scholar that paper has been cited 66 times, including in many articles that are in at least somewhat reputable journals (published via Elsevier, IEEE, Springer, etc.) and apparently two books [0]. So I think it's pretty clear that being a "scientific paper from Princeton" was significant reputation-wise, regardless of whether or not it was peer reviewed.

[0] https://scholar.google.com/scholar?q=Epidemiological%20model...

I was also skeptical about the citations, so I checked the two with freely available PDFs to confirm they actually cite this paper; they do.

[1] https://pdfs.semanticscholar.org/4ebb/b654c59aff454ad97301e6...

[2] https://arxiv.org/pdf/1509.07805.pdf


Right, but it's not a scientific paper. It's a problem when people trust something just because it came out of Princeton, but the problem you're identifying isn't one with scientific papers (and the citations you've identified are places where they're using the Cannarella paper as an example of people being interested in applying epidemiology to social networks, not citing the research they did as accurate or important.)


The majority of the papers citing it that I checked (including the two you have linked yourself) make the citation for the idea of the method (using epidemiology models in the context of social media), not for the results.


several of my blog posts have had high google scholar citation counts; doesn't make them scientific articles, though. still just my personal opinions.


> Had it been a blog post, I don't think anyone would have taken more than a cursory look at the analysis. Yes, anyone can fit data points and reach a conclusion with blogs, but isn't academic research fraught with similar problems already?

Some papers are cited not because they are well written, but simply because they covered a topic which might not have been covered by other researchers. Literature review is a fundamental part of a paper.


I have no idea why it's so meritorious. I know people who are published, and don't even know what the paper their name is on is about, both at the outset and twilight of academic careers.

I also know a guy who published preliminary results and musings on his blog, and was then refused publication because it was "unoriginal".

Journal editors seem to not be familiar with the subject matter either, given what does and doesn't get published - why only ever positive results?

Unless scientific publication changes completely, science as we know it will die. It is already becoming vanity, egoism, orthodoxy. Why publish? So you can get better pay later in the private sector.


Here are some of my thoughts after a quick scan:

1. George Box has a very famous quote: "All models are wrong; some models are useful."

2. Even if the model applies correctly to Myspace, that does not mean it is correct for Facebook. I would use a very common term for this: "survivorship bias/selection bias". The only way to validate the model is to run it through a large number of cases. More than one social network is active; can the author predict outcomes for them, and for the old ones? Here is a possible list: https://en.wikipedia.org/wiki/Social_networking_service

3. When researchers cite a paper, they are not necessarily agreeing with it. Some papers even criticize each other. They may accept the concept of Facebook cooling down, but not within a couple of years (i.e. with different parameters). If you want to know whether they all agree or not, survey them. Like I said, you (and probably I) are probably subject to survivorship bias.

4. There is at least one mistake here: "discussed in Section ?? [3],". So, was it well proofread?


You missed a point. A research paper ensures that the right methodology is followed to arrive at a result. It offers some assurance about the result itself, though not a guarantee. Besides, a research paper also seeks to state under what conditions this particular result has been achieved. The peer review process - as flawed as it may be - ensures that at least 3 other people who understand the topic have gone over it and have given their vote of approval.


This seems a bit idealistic, or perhaps I'm too cynical. Reproduction rates on soft-science papers are worse than should be expected of a junior high chemistry class.

Peer review, looking at it from the outside as a layman, appears to just be an enormous rubber stamp. I cannot believe that academics have time to rigorously go over the mountains of pages of dreck that make up most research papers, and even read the whole thing thoroughly, much less double-check the numbers. Not when they have a stack of papers to review a foot high, and another two feet of grants to write, their own research to do, not to mention what little teaching they haven't farmed off to grad students.


>Good journals do have peer review, and they wouldn't have accepted this.

Most peer reviewers do not get to see the data in the article.


When I submitted a paper to AAAI, there were dire warnings that the paper would be rejected if I included links to any supplemental materials (such as code or data), because this would compromise blind review.

The conference software they were using had a sketchy form for uploading supplementary data, but it appeared that they expected it to be used for small data tables or something. It certainly wasn't going to be an option to upload 12 GB of data into their conference software; they mentioned nothing about code, particularly how to specify its dependencies and the computing environment it needed to run in; and it also certainly wasn't going to appear in any form that was convenient to review if I did.

How could you submit code to blind review anyway? Do you make alternate versions of all the dependencies that don't credit any authors that overlap with the authors of the paper?

In short, because of blind review, I was forbidden from doing anything that would make my paper reproducible at review time.


You are confounding reproducibility and repeatability. Repeatability is the ability to repeat your exact experiment under the same conditions, whereas reproducibility refers to other people being able to reproduce your experiment under similar conditions and obtain congruent results.

Of course, both are needed for great science. Nonetheless, your paper alone should provide a good enough description of the conditions and methods you used for the paper to be reproducible.

In fact, it can easily be argued that the ability to just run your code instead of re-implementing it from your description is actually detrimental. To see how, consider that a bug in your code, re-used by other researchers, can easily lead to multiple derivative works finding entirely wrong results. In contrast, if those other researchers re-implemented your method, there would be a much lower probability of them making the same mistake you did, leading to incongruent results and hence raising alarms that would probably lead to the discovery of your bug. Although re-implementing your algorithms is significantly more work, the overall quality of our research would benefit from doing so...


> Nonetheless, your paper alone should provide a good enough description of the conditions and methods you used for the paper to be reproducible.

Too bad there's an 8-page limit, then.


8 pages now? It used to be 6. Anyway, 8 pages can fit a lot of explanation/discussion. Also, keep in mind that AAAI is a conference, where you are supposed to present promising research that you are still edging out. Finished works are supposed to be presented in journals, where space constraints are typically more relaxed.

On a personal note: I'm sick of smart-ass reviewers, unprofessional researchers, over-selling of results (to put it mildly) and so on. I think the whole system is so corrupted that I just quit research in despair. Unfortunately, I don't see how blogs and/or just "publish the code" would magically solve all those problems.

Anyway, publishing your code is certainly a good thing to do, so I applaud and encourage you to keep doing it!


> Also, keep in mind that AAAI is a conference, where you are supposed to present promising research that you are still edging out. Finished works are supposed to be presented in journals, where space constraints are typically more relaxed.

This is a dated view of AI research.

Promising research appears in workshops, blogs, and/or arXiV. Finished research appears in conferences. Nobody's sure what journals are for.


Talk to your advisor about that. Everybody is pretty sure that journals are to support your academic curriculum: good luck getting postdoc positions without Q1's under your belt.

Of course if you are in one of the Ivy League universities this may be different. Otherwise... yeah, you may feel research moves too fast for journals, but curricular evaluation practices move even more slowly.


>Talk to your advisor about that. Everybody is pretty sure that journals are to support your academic curriculum: good luck getting postdoc positions without Q1's under your belt.

I agree with GP. The view that conference publications are less important than journal papers is inaccurate, depending on your field.

In my field, it's as you describe. Conference papers are not polished, and journal publications are what matter in your academic career.

For many of my peers in some disciplines in CS, it was the opposite. Getting into a highly regarded conference was much more valued than publishing in a journal.

Then I found this was not limited to CS.

It really just depends on your discipline's culture.


> depending on where you host it, they often don't last long

I wrote my bachelor's thesis last semester. You wouldn't believe how many papers I needed archive.org for. By the end of it, I made a small donation and recommended that the company do the same.

Blog posts might not be better, but hosted papers aren't that stable either.


anyone

I think this is the key point here, which the OP seems to ignore completely. I sort of agree with everything the OP states, but they also overlook some key aspects of blogging. For instance, anyone can post whatever they like, even on subjects they really don't know a lot about. Also, not every blog post gets reviews from all sides; a post might only get 'yes, good' replies originating solely from confirmation bias, and the lack of any criticism doesn't make everything right; it may just indicate that the critics never found their way to the blog post.


There's an inverse side to that "anyone": professional scientists are under pressure to publish long scientific-looking articles, and their fellow reviewers have almost no motivation to spend any significant time reviewing (no pay, no nothing, just a small non-quantifiable reputation penalty for declining to review).

Whereas most scientific bloggers post about things they are interested in and therefore feel they have something to say.


professional scientists are under pressure to publish long scientific-looking articles, and their fellow reviewers have almost no motivation to spend any significant time reviewing

You have a point, but of all the scientists I know personally, there's not one to whom this applies. It might depend on the field, though, and I'm not denying there's a bunch of impossible-to-reproduce crap published and there are rotten apples everywhere. But what I see (in fundamental research, which is usually not as publicly known as other kinds) doesn't come near what you describe, even though it's anecdotal of course. Yes, there is pressure to publish, but their papers aren't 'scientific-looking', they're properly scientific, and they are just the length needed to describe the findings on the subject. And there is an abundance of motivation for reviewing, mostly inspired by wanting to make sure everything is as correct as possible, for the sake of research.


Thanks for the balancing point. My experience describes applied research in such "hot topics" as computer vision and machine learning, and many people, myself included, are more interested in getting the technology to work and apply it in real commercial projects, where paper publishing may seem like a price to pay for research grants subsidizing your R&D.

These grants are aimed at building a working technology and strengthening national economy, so there's nothing wrong with the approach per se, but when your reviewers share your values they have little motivation to find every possible mistake in your paper, at least for second- or third-rate journals (which are still indexed in WoS so are perfectly enough for the funding agency).


The value of content depends on the collective intelligence of the network around it.

Good networks concentrate intelligence by providing feedback and support that helps the network converge on useful, original insight.

Bad/malicious networks destroy it with noise and false information, creating a feedback network that suppresses reality-based original insight.

The process of blogging is irrelevant. So is the process of peer review.

It's the quality of the networks around each area of interest that defines the real value, not the process.

E.g. peer review works well when it's part of a high quality network. When it isn't, it's no better than random posting.


Do not underestimate the effect of tooling on the shape of a community.

Free and gated idea exchange create completely different communities, with completely different problems and advantages.


(to repeat a comment I made on the blog)

We're seeing a huge amount of non-traditional scholarly activity (if you'll pardon the dry phrase) happening in blogs. Alternative metrics aka altmetrics (as distinct from altmetric.com) have been taking into consideration the scholarly activity that happens around traditional publishing for a while. Crossref, the organisation who brought you DOIs for scholarly publications, thought that it would be a good idea to help collect this kind of data as a counterpoint to traditional publishing and citations. We're building Crossref Event Data, which is a free (libre, gratis) service for collecting mentions of articles on blogs and social media, so that it can be used by the community in all kinds of ways. Discoverability, recommendations, and yes, maybe more metrics. How you use it is up to you.

The article raises a good point about blogs as primary methods of publishing rather than, for example, as a venue for the discussion of traditionally published articles. Establishing an open 'citation'-graph-like-dataset of blogs is a good first step toward that.

We're heading into Beta soon, and you can read more about it https://www.crossref.org/services/event-data . The User guide is a work in progress, but might answer any questions you have: https://www.eventdata.crossref.org/guide . You can also contact me at jwass@crossref.org if you have any questions.


People here seem to attack only the argument that "anyone can write a blog", but I was expecting HN to also give a couple of thoughts to his first point. To me it reads something like: "publish all your data out in the open for everybody to see".

I find it extremely problematic. Sure, sharing data is necessary to verify that the conclusions are supported by it and they are not due to methodological errors. But to anyone? Out in the open? I would expect better data ethics, especially from a psychologist.

Your data can contain political opinions, health records, sexual orientation, contact information, the places a person has visited, and when. People sign up for these studies usually with the agreement that the information that can harm them cannot be freely shared, unless to people involved in studies with similar data protection systems. Data like I mentioned has sometimes to be stored in computers not connected to the Internet, to reduce the risk of data leak. Free access to this data paves the way to persecution and shaming.

If I were to sign up as subject to a psychology study and had a person with his ideas leading it, I'd withdraw immediately. I'd question if this person should be a psychology researcher at all. Sharing data is good, but protocols are there for a reason.


This is obviously a problem that will be difficult to solve for social and medical sciences, but surely for the life and engineering sciences a general stance towards openness would be beneficial to the disciplines?


Sure, I can concede that -- and by the way I still maintain that data sharing, done right, should happen in the social and medical sciences too.

However, he put a warning about his field in the introduction of his post, as if he expects his arguments to apply especially to experimental psychology.


Personally identifiable details can be removed from data.


There is a seminal paper showing how carelessly adopting this point of view can lead to disasters: "87% of the U.S. population is uniquely identified by date of birth, gender, postal code" [1]

In many other cases, as pointed out, it's just not possible. I'm studying mobility patterns through cellphone metadata. Even if you replace the actual phone number with a random ID, you still know where a person is going, and can thus re-identify them if you have other public data.

[1] https://en.wikipedia.org/wiki/Latanya_Sweeney
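
To make that point concrete, here is a toy version of the uniqueness check behind Sweeney's result (a minimal sketch with made-up records, not her actual data or method): count how many records are pinned down by the quasi-identifier combination (date of birth, gender, postal code) alone.

  from collections import Counter

  # Hypothetical "anonymized" records: names removed, quasi-identifiers kept.
  records = [
      {"dob": "1975-03-02", "gender": "F", "zip": "02139"},
      {"dob": "1975-03-02", "gender": "F", "zip": "02139"},
      {"dob": "1981-11-17", "gender": "M", "zip": "02139"},
  ]

  key = lambda r: (r["dob"], r["gender"], r["zip"])
  counts = Counter(key(r) for r in records)
  unique = sum(1 for r in records if counts[key(r)] == 1)
  print(f"{unique / len(records):.0%} of records are unique on (dob, gender, zip)")

Any record that comes out unique can be linked back to a named individual by anyone holding a second dataset (e.g. a voter roll) containing the same three fields.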


full name + date of birth + place of birth.

That leaves about 500 non-unique individuals in a country of > 50 million inhabitants.


The very depressing opposite conclusion was discovered in the late 70's:

DE Denning, PJ Denning, M Schwartz, ‘‘The tracker: a threat to statistical database security’’, in ACM Transactions on Database Systems v 4 no 1 (1979) pp 76–96

A general tracker can always be found, unless the data released is extremely restricted. Almost anything is personally identifiable as it can be used to build a tracker into the database.

I am aware of this result from chapter 9 of Security Engineering (http://www.cl.cam.ac.uk/~rja14/book.html) by Ross Anderson, if you are more generally interested.


They actually usually cannot, especially if you want to still be able to do research into them (which is the motivation behind differential privacy).
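
For anyone unfamiliar with the term, here's a minimal sketch of the differential-privacy idea (the Laplace mechanism applied to a simple count query; the numbers are made up for illustration):

  import numpy as np

  def private_count(true_count: int, epsilon: float) -> float:
      # A count query has sensitivity 1: adding or removing one person changes the
      # answer by at most 1, so Laplace noise with scale 1/epsilon gives
      # epsilon-differential privacy for this query.
      return true_count + np.random.laplace(loc=0.0, scale=1.0 / epsilon)

  print(private_count(true_count=42, epsilon=0.5))  # a noisy but still usable answer

Researchers get statistically useful answers, while no single participant's presence in the dataset can be confidently inferred from the output.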


Titus Brown also wrote a blog post on this earlier this year, "The top 10 reasons why blog posts are better than scientific papers" http://ivory.idyll.org/blog/2017-top-ten-reasons-blog-posts.... and it generated similar discussion.

There are a few elements they emphasize.

One is what the blog format enables that traditional publishing doesn't support: things like real-time feedback and comments, versioning, and making the blog post interactive rather than a static document. Another element of the format is the lack of gatekeepers, so a post can be disseminated quickly and by anyone, and there aren't barriers to participation in the scientific discourse.

Another is norms and expectations. In blogging, it is more the norm that data and code are open. Open is still possible in traditional publishing; it just isn't yet the norm. A new format, however, enables new norms, and it's easier to set them from the start than to revise existing ones.

Finally, there's the element of 'correctness'. Going through peer review and being in a traditional journal certainly doesn't ensure that the paper is correct. You can just look at retractions to see that http://retractionwatch.com/2011/08/11/is-it-time-for-a-retra.... However it would be interesting to see more evidence around whether the blog format does ultimately lead to more 'correct' conclusions, on the whole, and not just for the posts that lead to a lot of discussion.


Is it so hard to publish at arxiv, share the paper on twitter, and submit a link to HN? At a minimum, arxiv is better than a blog post because there's some sense of versioning.

For anyone who is curious, some conferences share the peer review comments (even though they do not share the identities of author or commenter). Here are the comments from iclr2017 (https://openreview.net/group?id=ICLR.cc/2017/conference) and here are comments on Ian Goodfellow's GAN paper by Jürgen Schmidhuber and others (https://media.nips.cc/nipsbooks/nipspapers/paper_files/nips2...).


For some time I've thought that the concept of a "journal" artificially forces together several features that can exist independently. This usually includes something roughly similar to: (not a complete list, or in any particular order)

1. people that want to publish their research, theories, response, etc

2. a mechanism for having submissions peer reviewed

3. curating which submissions will be included in the publication

4. actually physically (or electronically) publishing and disseminating the papers to the interested community

5. people actually reading it and responding (possibly recursively)

6. reputation and status derived from authorship, being referenced by others, etc

7. archiving the papers - and hopefully the data

These features are traditionally part of the same process for practical reasons, like the cost/time to physically publish before modern printing technology. This worked, creating resistance to any change. Now, I suspect that over-reliance on the concept of a "journal" limits your thinking. This article demonstrates some of that with its framing of blogs as an alternative or competitor to journals.

Instead, consider that modern computing and the internet make #4 very easy and almost free. We have various ways to archive (#7), which include actual verification (e.g. "git fsck", signed commits). We already curate (#3) as a separate step with specialty blogs and aggregators like HN.

I'm not saying journals are bad or obsolete. I'm just suggesting that there might be better ways to organize the process, and that different granularities of "journal"-like process can probably coexist.


Many of these topics are constantly being championed by the somewhat obscure digital libraries space. These departments are mainly found in universities, and they focus heavily on the pipeline you described. Despite the point you made that throwing content on the internet is usually easy and "almost free", it's actually a pretty large and costly effort when you're trying to do it at scale. A lot of clerical and engineering effort goes into tagging/organizing/storing/replicating the data, guarding against bit rot and other integrity errors, and making the data easily searchable and browsable (and that's to say nothing of the political and psychological battles involved in simply getting researchers to follow through and provide the data to these processes, in the proper formats, before they finish their degree and disappear).

If the hacker side of these challenges interest you, I suggest checking out the code4lib community: https://code4lib.org


Still, compared to what it used to cost to just distribute it, and compared to the cost of journal subscriptions, it's approximately free.


> Blogs have Open Peer Review

No, they don't. This is conflating having an opinion with the scientific review process. The scientific review process can prevent publication of bad research in journals and require changes before publication, the blog "review" process cannot do either of these things.


I feel comment sections on well targeted sites like HN are better than peer review sometimes.

It's not uncommon for an industry expert or creator of a product to Koolaid Man into the comment section around here.


I didn't stay long in the academic world, but the comments on targeted sites tend to be better than nearly all the peer review I saw there.

The thing is, the very small exception is incredibly good.


So, schema.org has classes (C:) -- subclasses of CreativeWork and Article -- with properties (P:), domains (D:), and ranges (R:) which cover this domain:

- CreativeWork: http://schema.org/CreativeWork

- - BlogPosting: http://schema.org/BlogPosting

- - Article: http://schema.org/Article

- - - NewsArticle: http://schema.org/NewsArticle

- - - Report: http://schema.org/Report

- - - ScholarlyArticle: http://schema.org/ScholarlyArticle

- - - SocialMediaPosting: http://schema.org/SocialMediaPosting

- - - TechArticle: http://schema.org/TechArticle

Thing: (name, [url], [identifier], [#about], [description[_gh_markdown_html]])

- C: CreativeWork:

- - P: comment R: Comment

- - C: Comment: https://schema.org/Comment
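
To show how those classes fit together in practice, here's a hedged example (the URLs and text are invented; the type and property names are from schema.org) of JSON-LD markup for a blog post carrying a review-style comment, built with Python just to keep it copy-pasteable:

  import json

  post = {
      "@context": "https://schema.org",
      "@type": "BlogPosting",
      "name": "Reasons blog posts can be of higher scientific quality",
      "url": "https://example.blogspot.com/post",  # hypothetical
      "about": "scientific publishing",
      "comment": {
          "@type": "Comment",
          "text": "The claim in section 2 needs a reference.",
      },
  }
  print(json.dumps(post, indent=2))

Swapping "@type" to ScholarlyArticle or TechArticle is all it takes to label the same content differently for crawlers.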


#1 Reason: Blogs aren't included in the publication count which is needed to gain research funding, get hired as a professor and get tenure.

A bit cynical, but a lot of the problems the writer suggests are more of a symptom of trying to get as many papers out as possible than the cause (and that one rarely reviews the supplementary material). If scientific blogs counted like publications many of the same problems would appear there.


The author calls "Nature" a "low quality journal". Is that really it's reputation?


Its reputation among people like Lakens, and others complaining about the poor state of statistics and rigor in modern science, is that it features overinterpreted overhyped results from small underpowered experiments, and when errors are pointed out it's usually not too interested in correcting them. The editors are more interested in "important" results than methodologically rigorous ones.

It's similar to how Andrew Gelman always sarcastically refers to the Proceedings of the National Academy of Sciences as the Prestigious Proceedings, because that's invariably the adjective used in news reports, yet many of the articles are of the "female-named hurricanes cause more damage" type.


It has a reputation of being very splashy. If you have really good work it gets put there, is bright, is well-regarded. However, among the bench scientists (not the professors) who have published in their journals, it is clear that their style of compressing a large amount of ground-breaking work into a page or two of text is really not in the science's best interest. They have a journal specifically called 'Nature Methods', where new protocols and methods go. Most of the time it is actually impossible to cram the actual method being used into their allotted space. So methods either get dumped into the (not-reviewed) supplement, or are not included at all. A paper from Nature usually lacks required data/detail that other journals provide - and this is especially problematic when coupled with being 'cutting-edge' science.

So you get this effect of very hot research that the professors can gloat over, relatively easy to read, well-publicized, but actually lacking in much scientific data or detail when it comes to the actual text. In the field I work in I need access to DNA and protein sequences that should all be freely available. When I find a Nature paper the sequence is invariably missing, or presented in the supplement as a PNG screenshot of a Microsoft Word file. Again, it's not a detail a professor or casual reader would care about - but it's a detail that's absolutely critical to the science and for replication.

Most journals are neither so condensed to the exclusion of data/protocols, nor so well-read as to actually create issues when it comes to replicating, correcting, or retracting them.


"Nature" comes in for a lot of criticism because they have a news division that publicizes results widely, yet does very little when those results are retracted...and they are corrected or retracted increasingly frequently: http://retractionwatch.com/category/by-journal/nature-retrac...


Depends on who you ask, probably, but in the scientific world it's generally not regarded as low quality - rather the opposite. Also, its impact factor suggests that. Though the way that is calculated is also under debate by some, I'd say it still gives an indication of quality in general.


No. For biology at least it is the one to aim for.


Unless they're properly archived along with all their resources as well as only referencing other properly archived material, I'd be very cautious before saying they're better.

This can be done, but I suspect it is rare whereas it's the norm with a journal article.


I think they serve different purposes, and the quality varies heavily across subfields. Academic research will produce a journal article as the primary output, whereas a blog post is typically a side effect of a practical project, and most will contain a link to the actual code.

Ultimately, I think they're both good, but they do have different uses.


I'm still highly concerned about the loss of these things. I very regularly follow links to code that's just gone. The code for research definitely should not exist only as a link in a related post.

Archiving, for me, is a requirement if you think it should be used over more than a year or so.


Obviously, having a professional librarian working on archiving data, code, and articles is ideal. However, I've found that the public facing side of journals experience bit rot even faster than most of the older blogs. Of course, there are exceptions, and even the most public figures can decide to delete everything, code included (e.g. _why)

Gwern has tackled link rot[0] in an exhaustive way, and may give you a starting point from where to continue.

[0]: https://www.gwern.net/Archiving%20URLs
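
One low-tech complement to Gwern's approach is to ask the Wayback Machine to snapshot every URL a post or paper cites. A minimal sketch (assuming the public "Save Page Now" endpoint at web.archive.org/save/ still behaves this way, and with a made-up URL list):

  import urllib.request

  cited_urls = ["https://example.com/code-release"]  # hypothetical citations

  for url in cited_urls:
      req = urllib.request.Request("https://web.archive.org/save/" + url,
                                   headers={"User-Agent": "link-archiver"})
      with urllib.request.urlopen(req) as resp:
          print(url, "->", resp.status)  # a 200 suggests a snapshot was triggered

Running something like this when a post is published at least guarantees there's a copy somewhere when the original host disappears.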


Zenodo is a good example of this, as is Figshare. They allow the assignment of a DataCite DOI to some kind of digital asset (dataset, code, github repo etc) for citation https://zenodo.org/


It seems the future lies in something like GitHub, stack overflow or quora for articles, where people with the best code/answers naturally gain more reputation and where you still have an incentive for peer review and open data.


As in, "if you follow these best practices, you too can produce high quality research"

There is no reason why this post should be focused on blogs in general, aside from generating discussion + clickbait. The key here is actually following the best practices, but do most blogs actually do this?

Alternatively: there is a high minimum requirement for research articles, but there is no minimum requirement for blogs.


I think it might be better to say that contemporary psychology is frequently not being performed scientifically.


Advantages that scientific journal articles have over blogs are that they are:

(i) not written in clickbait format

(ii) impact is measured in terms of other research which sees it as a relevant study to cite rather than numbers of pageviews (see also (i))

(iii) one of the things a peer review process can flag up is that bold and implausible claims like "On blogs, the norm is to provide access to the underlying data, code, and materials" probably need some form of qualification for what is meant by "blogs" and/or quantitative evidence to support that claim

Blogs can be well reasoned and evidence-based, comments can (even more infrequently) be enlightening and journal articles aren't immune to flaws, but the linked blog article is a prime example of why arguments expressed in blogs are seldom afforded the respect of arguments advanced in journals.


> (i) not written in clickbait format

Really? I find them to be incredibly clickbaity, whereas blog posts are much more matter-of-fact. Blogs are almost never a commercial entity/enterprise, and while no writer would shun search engine traffic, bloggers usually write for peers, not for publicity like journalists or scientists, both of whose funding depends on it.


I have yet to see the stereotypical "5 reasons why [bold claim involving zero research]" format the linked blog post uses in any academic journal.

Scientific writing in closed-access journals is the definition of writing for peers, and never includes AdSense, affiliate links or digital tip jars.


One more related link on the same subject, "The top 10 reasons why blog posts are better than scientific papers": http://ivory.idyll.org/blog/2017-top-ten-reasons-blog-posts.... * there is a "maybe satire" tag added by the author to this post.


"Impact factor" systems incentivise the production of poor quality journal articles. Effectively researchers end up needing to produce a certain amount of "journal-bait" in order to stay funded.

(See HN passim "Every attempt to manage academia makes it worse")


There's a lot of publications. Some of them are really small step attempts at understanding. Sometimes it's just a test. A careful one, but nothing more. I guess in some fields we're still at the poking level.


IMO This is one of those cases where people are committed to an existing system and have bent their thinking around that system to the point that they cannot see the big picture.

There's a lot of value to the existing journal system, but perhaps it would be appropriate to have both a more formal process and a more informal/collaborative process. Maybe that would address the problems with having so many journals.


Can anyone recommend scholarly / academic blogs (or aggregators)?


http://blogs.sciencemag.org/pipeline/ "In The Pipeline" is good and has been running for years.


Thanks! Looks great.


They left out the biggest one! If you say something dumb in it, the comments will crucify you for it :)


And also if you say something controversial.


What you point out is an interesting effect. People who write controversial blog posts know that there's a chance a sh$#storm will ensue in the comments. In the case of ordinary bloggers, such as researchers writing on personal blogs (i.e. not people who are actively trying to troll or generate pageviews), how does this affect writing that is controversial?

I don't know academic publishing well enough to compare. I think that as far as controversial claims that would make it to somewhere like Hacker News, the effect is that if the controversial statements are wrong, then the comments will shoot them down, and the story will be buried. Furthermore, the rest of the comments will pile on the first comment that points out that the blog post is mistaken.

Perhaps an effect of this is that you can tell from the tone that some blog posts take, that they are being extremely careful to lay a rock-solid foundation, and kind of "convert" their readers who might otherwise vehemently disagree with them. For a recent example, check out this blog post:

https://gregfallis.com/2017/04/14/seriously-the-guy-has-a-po...

I actually just noticed something really interesting! The very first words in this blog post are: "I got metaphorically spanked a couple of days ago."

Would any academic article on any subject start in this way?

So the effect of the backlash against controversial statements, on the writing style of blog posts, is interesting and in many cases highly visible.

It is hard for me to compare this with academic writing. (But it certainly wouldn't start with words like that.)


I can't comment on the field of psychology. However, I will comment on my field of physics. The author seems to completely neglect the signal-to-noise problem. There has been an explosion of journals and scientific publication. It's rather hard to keep up to date on the literature, and editors do provide a valuable function in filtering what they think is interesting (which is further filtered by the referees).

Also, on the referee process: it's not perfect and will probably not catch complete fraud. But it is good for checking the basics of whether or not the story being told in a paper is consistent. Also, again related to signal/noise, the editors hopefully choose referees with a minimum level of competence in the field (not always, but usually). And in terms of dilution: if there's a blog, there's no guarantee that a critical mass of people will read it to try to give it a critical review. Whereas even for a journal that is not widely read, the papers in it will have received some peer review, where someone in the field tried to look through it for obvious errors.

I am not convinced by open peer review. Let's consider two scenarios. Suppose there is a leader in the field trying to publish something (or perhaps even a friend or collaborator). If I notice that they are wrong, am I more likely to call them on it if I am anonymous or if they know who I am? One can be idealistic, but researchers are people too. I think that anonymous review allows people to speak frankly (though it does sometimes let people be mean -- that seems to be the cost of anonymity in forums unless there is strong moderation).

Let's consider another scenario. Suppose I am a junior researcher and I get a referee comment that I determine is wrong. If it comes from the leader in my field, maybe I won't push back as hard as if it came from someone else junior -- again, in an ideal world it wouldn't matter, but researchers are people too. It would seem that the editor knowing the identity of the referees serves as a check on bias; not 100 percent, but I think more is gained than lost from a closed referee system.

In terms of error correction, there are errata that are published if someone finds an error in a major study and in journals like the physical review are linked back to the original article.

The open data problem is hard. In my particular field, our raw data is available on the web, but the problem is that the metadata needed to interpret it is not. And that's nontrivial. For example, I may perform several operations on my data (for example, background subtraction) before fitting it to a model. We are experimenting with dataflow languages where we embed the series of "filters" that we apply to the data, but if we want someone to be able to reproduce this 20 years from now, then there's a whole ecosystem that would have to be maintained--for example, not just my code, but all of the libraries that it depends on to run. We can describe the basic process in the paper, but for true long-term reproducibility, it's a hard problem... There are groups working on open data and reproducible research, so I don't think there isn't interest, just that it's not as easy a problem as you might think. But for reduced data, I agree, it would be nice to have that available for papers in a machine-readable format...
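
As a concrete illustration of that "embed the filters with the data" idea, here is a minimal sketch (hypothetical filter names and toy numbers, not the commenter's actual system) of recording the chain of operations alongside the reduced result, so the published data carries its own processing metadata:

  import json

  def subtract_background(data, level):
      return [x - level for x in data]

  def normalize(data):
      peak = max(data)
      return [x / peak for x in data]

  raw = [10.0, 14.0, 30.0, 12.0]
  steps = [("subtract_background", {"level": 8.0}), ("normalize", {})]

  data, provenance = raw, []
  for name, params in steps:
      data = globals()[name](data, **params)
      provenance.append({"filter": name, "params": params})

  # Publish the reduced data together with the exact filters used to produce it.
  print(json.dumps({"reduced": data, "provenance": provenance}, indent=2))

It doesn't solve the "will these libraries still run in 20 years" problem raised above, but it at least makes the processing steps explicit and machine-readable.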

Open access. This is also a hard problem. Increasingly, funding agencies are requiring that publications be made available in an open format after some embargo period. I think that may be the best we can hope for. Just paying for competent editors requires funding. If we want additional features like data attached to papers (for decades), it will take more funding. That money has to come from somewhere. For typical open access journals, the author pays, but that seems to create difficult incentives--not to mention that it makes it difficult for less well-funded researchers to publish. I'm not sure what the right answer is, because I agree that the public should be able to see the results that they paid for--perhaps an embargo period is the best solution...

I think a blog is a great way for communication of ideas and for education, but I do not think that it is able to replace publication in a refereed journal.


  "Why can... "
Uh, shouldn't that be "How can... "?


Why?


Can't they?


blog post on blogspot advocating that blogs are better than journals seems a bit biased, unfortunately i don't think i can get a journal article about journal articles being better, accepted in any peer reviewed journals. checkmate daniel.


In my opinion blog posts have no scientific quality at all, since they are not peer reviewed. They may occasionally have quality, but that's another matter.

To be fair, some fringe journals have no scientific value either, because they do not peer review properly, but they are easy to spot and everybody in the scientific community knows them.

On a side note, originality is one of the key requirements for publication in a reputed journal and I have not ever seen a blog post that made any scientifically original claims. But maybe that's because I don't read blogs very often.


https://en.wikipedia.org/wiki/Polymath_Project

Bunch of mathematicians collaborating through blogs.




