Hacker News
Daniel Kahneman “I placed too much faith in underpowered studies” (replicationindex.wordpress.com)
819 points by skmurphy 11 months ago | 328 comments

People are starting to learn that the vast majority of "science" are poorly-controlled white papers that get accepted and are never looked at again unless it is by a group of replication-crazed people (or what I like to call "actual scientists") reviewing conclusions drawn from decades-old papers.

Discouraging replication in the tenure track is a large contributor to this. "Novelty" is literally written in the "guidelines for authors" sections of many journals. They want the newest, brightest, most headline-catching "research" to disseminate. And so do the educational institutions. No wonder the incentives are so perverse.

On top of this, most accepted research is allowed to be published without open access, open data, open peer-review history (how many rounds did it go, what were the objections, how did the researchers answer them, etc), and with the aforementioned lack of replication.

It's incredibly frustrating being someone who loves science, works in the field of science, and is skeptical about the system, which used to be a prerequisite and is now looked at like luddite behavior.

The reality is that genuinely interesting, productive and novel results are hard to produce. Even for teams of highly intelligent researchers. The scientific industry is built on the assumption that such results are in greater supply - or more accessible - than they actually are.

I don't think I could have put it better than that. Very well done. I am reminded of a parallel quote by Warren Buffett regarding the lack of truly good investments:

"I could improve your ultimate financial welfare by giving you a ticket with only twenty slots in it so that you had twenty punches - representing all the investments that you got to make in a lifetime. And once you'd punched through the card, you couldn't make any more investments at all. Under those rules, you'd really think carefully about what you did, and you'd be forced to load up on what you'd really thought about. So you'd do so much better."

I'm really surprised he said that. It flies in the face of diversified risk and everything it affects, like Sharpe ratios.

Buffett is not widely recognized as a fan of any form of the efficient market hypothesis in general or of diversification-reduces-risk specifically.


"The way to become rich is to put all your eggs in one basket and then watch that basket." - Andrew Carnegie

The way Buffett invests is by reading all day, every day, and once in a while making a very informed investment. Most people haven't even reached the level of staying away from 'sure things' and panic selling during bad news.

How many people would've been able to stay away from IT stocks during the dotcom boom, or even to not invest in the company of one of their friends? (Bill Gates and Buffett have known each other since '91.)

How often do you, as a 'master', break rules that you would tell an 'apprentice' are immutable?

The common numbers thrown around are that with 20 stocks your portfolio risk is reduced by 70%. The corresponding ratios follow. So in that sense it's not that bad.
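A rough sketch of where numbers like that come from. This assumes (my assumption, not anything stated above) an equally weighted portfolio in which every stock has the same volatility and the same pairwise correlation; the 0.40 volatility and 0.10 correlation are purely illustrative values, and the function names are made up for the example. The exact percentage you get depends heavily on the correlation you plug in and on whether you measure risk as variance or as standard deviation.

```python
import math

def portfolio_volatility(n_stocks: int, sigma: float, rho: float) -> float:
    """Volatility of an equally weighted portfolio of n_stocks.

    Assumes every stock has volatility sigma and every pair of stocks
    has the same correlation rho (a simplification for illustration).
    """
    variance = sigma ** 2 * (1.0 / n_stocks + (1.0 - 1.0 / n_stocks) * rho)
    return math.sqrt(variance)

def risk_reduction(n_stocks: int, sigma: float, rho: float) -> float:
    """Fraction by which volatility drops relative to holding one stock."""
    return 1.0 - portfolio_volatility(n_stocks, sigma, rho) / sigma

# Illustrative inputs only: 40% single-stock volatility, 0.10 correlation.
for n in (1, 5, 20, 100, 1000):
    print(n, round(risk_reduction(n, sigma=0.40, rho=0.10), 3))
```

Under these assumptions the volatility approaches sigma * sqrt(rho) as the number of stocks grows, so most of the achievable reduction arrives within the first couple of dozen holdings and the remainder cannot be diversified away.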

Of course that's just one form of risk. The less attractive aspect of a 20 stock strategy comes from the fact that the majority of the market's returns come from very few stocks - The 80/20 rule applies pretty well here. With only 20 stocks you'll probably miss out on the few winners that contribute all the market's gains. It's very easy to end up with a 20 stock portfolio with low risk/variance and low returns.

Of course if you are Buffett, then your goal is to pick 20 stocks that all outperform.

It doesn't mean you can't buy indexes or other diversified assets but rather that you should stick with what you choose and not try to time the market.

I think it's meant to be taken as the number of times you get to change your portfolio.

I don't think so. Buffett is more of a buy to hold forever kind of investor.

One way of thinking about this: Every time you add a new instrument to your portfolio your risk goes down. But return also goes down because there are only a very small number of excellent investments available.

This is a cultural holdover from the late 19th and earlier 20th century when novel results were much more plentiful. But now nearly all of the low-lying scientific fruit has been picked so the scientific endeavor needs to adjust its expectations. Unfortunately, culture changes more slowly than science.

BTW, the lack of new results is actually a good thing. It means we've got a lot of stuff figured out.

> But now nearly all of the low-lying scientific fruit has been picked so the scientific endeavor needs to adjust its expectations.

Of our current energy/resolution paradigms

We know that ~20% of the universe is Dark Matter and that ~75% is Dark Energy. All we really know about Dark Matter is that it falls down and doesn't really interact with itself or anything else, really.

Dark Energy? Yeah, whatever it is, it pushes galaxies apart, that's about all we got.

There are a LOT of pretty easy discoveries to be made yet, but we just don't have the engineering to make them yet. The LHC was a marvel of engineering, but it can't even come close to the energy required to produce (theorized) Dark Matter particles. If you tried to make a bigger one, you'd need an LHC with a radius larger than the Earth's. That's not going to happen anytime soon, not by a long shot. We need another method than particle colliders, probably a very large space-based telescope to stare at black-holes with jets.

Biology is also rife with 'easy' things to do, but ones that we just don't have the tech for yet. Strangely, with all of the data that the Googs and the Zuck have, we have not seen a similar paradigm shift in psychology, criminology, or economics. At least in public literature.

I didn't say there was nothing left to be discovered, I said that most of the low-lying fruit has been picked. Obviously there's a lot left to be discovered, but what's left is manifestly more difficult to figure out than what has already been figured out.

Food for thought: it could be that the already-picked "low-lying fruit" is responsible for new discoveries being harder to make. Once you get into a certain mode of thinking, or come to view the world a certain way, it becomes harder to see the world as operating any differently.

So, maybe it's not so much that we have stuff "figured out", as it is that we have mental models that, while they work well enough to make certain predictions, also constrain our thinking in terms of what phenomena are possible. The "EM drive" controversy is one example of this tension.

Very unlikely. We are where we are today because along the way there were a LOT of alternative theories considered and discarded, either because they did not agree with experiment, or because they lacked explanatory power. Science is actually very open to new ideas. We even know for certain that the current foundations WILL change (because dark matter, etc.) It's just that making progress is really hard because our current theories already account for ALL of the phenomena that are easily accessible here on earth.

>> because our current theories already account for ALL of the phenomena that are easily accessible here on earth.

I'm not so sure of this. Have you heard of Edward Leedskalnin, or his "Perpetual Motion Holder"? The textbooks that I have discussing magnetism do not explain this effect, which you can see an example of here: https://www.youtube.com/watch?v=eWSAcMoxITw

If you can explain this effect with our "current theories", I would love to hear it!

That's a neat trick, but I see no reason to believe that it's anything other than straightforward electro-magnetism.

It is a neat trick! Somehow those pieces of metal are storing a magnetic field like a permanent magnet, but release their "charge" when the field is broken.

There's another interesting effect where an electric charge can be pulled from that set-up. Lasersaber demonstrates that here: https://www.youtube.com/watch?v=S_ssUTRbRRs

I certainly can't explain it, and I don't think we have an adequate theory that does explain it (yet).

It's altogether possible that you're correct, but I'd say that most of the responsibility lies in this: if you look at pondwater with your naked eye, you can see cloudy pond water. If you look with a carefully made glass bead, you can see bacteria and protists, and thereby make a TREMENDOUS discovery: unicellular organisms. But if you want to get even CLOSER and understand more about what parts make up the organisms and other things at greater magnification levels, you're going to need a bigger, better microscope, and that will cost more money, require more care and understanding to correctly operate, and be able to scan less total area at a time.

In other words, much of the problem we face in the current scientific age is due to the fact that our instruments are now electron microscopes and Hadron Colliders, instead of carefully made glass beads and simple mirrors.

It's similar to the problem faced in software of managing increasing complexity. No silver bullets known of yet for all those problems, just chronic application of powerful minds to the problem. :)

On the other hand certain technological leaps can create a whole field of new low hanging fruits.

For example within medicine, when computing power and AI are advanced enough to help us much more in connecting the dots between DNA, viruses, environment etc.

Yes, that's a very good point.

Yes, but that collider can't get to the energy levels needed.

edit: My original post was waaaay too aggro, I was way out of line. I misinterpreted your original post as meaning -the minimum size of a larger collider than the LHC would be larger than the Earth's radius-, instead of -a collider larger than the LHC that could hypothetically produce dark matter would be larger than the Earth's radius-.

for posterity: That collider can't get to the energy levels needed because it doesn't exist. It doesn't exist because its construction was cancelled after the project was defunded. Were it to have finished construction, it would accelerate protons up to 20 TeV. The LHC reaches 6.5 TeV. All of that information is clearly stated in the link I posted and is completely uncontroversial.

Stop bullshitting.

> BTW, the lack of new results is actually a good thing. It means we've got a lot of stuff figured out.

Not necessarily. If anything, the more we find out, the more we realise how much we don't know. Take quantum physics, that stuff slowed down since the 1930s.

It's not a bad thing, the journey's always fun too. But let's not kid ourselves that we're close to figuring everything out.

Personally, I interpret GP as we have already proven many of the true hypotheses that can be simply and easily formulated, based on the categories and concepts that we currently find intuitive.

And this shouldn't really be a surprise. Our cultures and societies have had thousands of years of evolution; our human instincts have had at least hundreds of thousands of years. In that time, both culture and instinct have picked up many true facts about the world that have helped us survive. A lot of science has been taking those facts and systematically investigating how far they hold.

So these both can be true: the more questions we ask, the more questions we have to answer, but also that many of the 'easy' hypotheses have been mined out, and any true advance in science will by necessity pull us further and further from the realm where our basic instincts and intuitions can usefully guide us.

Quantum mechanics being the prototypical example, of course.

> the more questions we ask, the more questions we have to answer, but also that many of the 'easy' hypotheses have been mined out.

Like a tree? The main branches are the most visible, and so the most tackled. The smaller branches are only realised when you're deep enough in it. That's where our age's advantage comes in, as we have many more qualified 'explorers' than before, but it also makes it all the more important that the foundational branches are solid.

But then again it may be closer to readjusting our lens. The big questions (of the natural kind, not innovation like AI) today still pertain to what we take for granted, like how quantum mechanics is making us question the nature of reality again. But much like the discovery of waves once threatened the credibility of light as particles, I'm optimistic that reconciliation can happen, and to ascend to another level ... but where can we stop? When can we say "This is it"?

Just musing..

My father used to illustrate the progress of science as an island in the Sea of Unknown. The more you expand the island by gaining new knowledge, the more you get into contact with the unknown.

But the things we realize we don't know are much harder to figure out, they take more resources (time, money, study participants, analysis, etc). That's the reason things are, and should be, slowing down.

We are, clearly, close to figuring out most of the easy stuff. For our current definition of "easy" based on the amount of work, computing power and strength of the mathematical tools that we consider normal.

When you can do multiple experiments and predict the results to 8+ digits of accuracy there is little room for improvement. Your theory also needs to predict the same 8+ digits for all existing experiments while being either simpler or allowing for some new prediction that's also accurate and in disagreement with existing theory.

There really are very few edge cases left for QM 2.0

> This is a cultural holdover from the late 19th and earlier 20th century when novel results were much more plentiful.

I'm not sure about that. As a former working scientist, I would say that the primary driver of this is the promotion and funding structure of science. I do agree that a lot of low-lying fruit has been picked.

> the primary driver of this is the promotion and funding structure of science

Well, yeah, obviously. I'm saying that structure is there because it's a holdover from previous generations where novel results were easier to obtain.

The funny thing is the results were gained either through paced pondering or by ridiculously talented individuals like von Neumann. Big science for non-applied fields is an oxymoron. Once a principle is figured out you can chuck a ton of people at it. But before that arrives, you can't pretend everyone is von Neumann and put them racing against each other. It seems there is no room for time and craftsmanship in the publish-or-perish approach to science.

If we are to measure science by its technological results, then we would judge Arrhenius' paper on global warming to be a failure because it didn't result in any iPhones being sold. The author illustrates a difference in value systems, but it's sort of inherent to those that one can't meaningfully argue one against the other. I think the problem is with our incentives, not our institutions.

No, I haven't. Thanks for the link.

or it could mean we're stuck in a local minimum due to structural barriers

What constitutes the "low-hanging fruit" changes constantly. I suspect that trying to quantify them at any given point in time is as infeasible as it is useless.

Then all the more reason for replication. Shouldn't we focus on making the foundation as robust as possible, before building more on it? It's possible to discover something exciting, or completely novel, but "exciting" and "novel" shouldn't really be the sole pursuit of a scientist. Rather, all outcomes including 'boring' ones should be considered with equal attention.

Hard to produce is not the same as hard to reproduce.

Genuinely new results require developing new insights, new techniques or just making new guesses that happen to be right. All are rare enough. But once that is done, the information gets out and the experiments should be easily reproducible.

Some disciplines get a lot of reproduction as a side effect of using last year's breakthrough research as "mere" technology for implementing next year's experiment. But I guess that sort of thing is less natural in psychology and sociology.

For many experiments, the cost (required effort) of reproducing them is the same as producing them, simply because you need to do all the same major activities once more. If they needed to spend an hour each with a hundred subjects, you need to do that too. If they needed to carefully run a long process on some materials, you need to do that too. If they needed to build a particular apparatus specific to that experiment, then you need to build or buy one.

And it's possibly more costly than the original since the initial lab likely was able to re-use some of their existing resources, trained people, infrastructure, lab animals, equipment (whatever matters for the discipline) which you don't have; they likely chose this particular experiment because it fits well with the stuff that they have that's not available everywhere. If they developed a particular experimental setup / gear / network of volunteers doing interviews in the subpopulation you're studying / cell or animal line / etc, then they're using that for many publications over the years, but you need to spend the whole cost to reproduce any one of them.

Good point.

I was once an experimental physicist and never a psychologist. In physics there is a lot of trial and error before you can get the experiment to do anything at all. Anyone who reads your paper needs fewer trials.

So I was going to agree with you that in fields where experiments work differently, reproduction can cost as much as the original study. But on second thought, that's wrong: and indeed it is the whole point of the thread.

There is always trial-and-error. You simply don't know if your theory is correct. And there is always the risk of a methodological weakness that you only understand after seeing the data. The trouble is that after you have already spent all your time and grant money on your study, you have every incentive to paper over these issues. Indeed even if you are scrupulously honest, you will have to shove the study in the proverbial file drawer - because you don't have the resources to try another iteration.

So I think it's still true that a new experiment costs more than reproduction, and many of the problems with studies arise from trying to pretend that cost doesn't exist.

The difference is that with a reproduction you know you are expected to succeed.

It's hard to find a Roman gold coin in a farmer's field but once you've found your first one you can be more confident that the second will be easier to come by.

Meanwhile, false positives are as easy as ever to come across.

Easier, if you factor in how much easier p-hacking is with modern software.
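To make that concrete, here is a minimal simulation of one common form of p-hacking, optional stopping: you keep collecting data, re-run the test after every batch, and stop as soon as p < 0.05. This is only a sketch under my own assumptions (no true effect, normally distributed data with known unit variance, a simple z-test); the function names and batch sizes are made up for the example.

```python
import random
from statistics import NormalDist

def p_value_two_sided(sample, sigma=1.0):
    """Two-sided z-test p-value for H0: mean == 0, assuming known sigma."""
    n = len(sample)
    z = (sum(sample) / n) / (sigma / n ** 0.5)
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))

def false_positive_rate(peek_every, max_n, n_sims=2000, alpha=0.05, seed=0):
    """Share of null experiments declared 'significant' when the test is
    re-run after every `peek_every` observations, up to `max_n` in total.
    Assumes peek_every divides max_n."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        data = []
        while len(data) < max_n:
            data.extend(rng.gauss(0.0, 1.0) for _ in range(peek_every))
            if p_value_two_sided(data) < alpha:
                hits += 1  # stop early and "publish"
                break
    return hits / n_sims

# One look at the end versus a look after every 10 observations.
print("single test :", false_positive_rate(peek_every=100, max_n=100))
print("peek each 10:", false_positive_rate(peek_every=10, max_n=100))
```

The data contain no effect in either case; the only difference is how often the analyst is allowed to look, which is exactly the kind of thing that takes one keystroke in a modern stats package.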

"Everything that can be invented has already been invented" - Charles H. Duell, 1899.

We don't know anywhere near "most" things. The supply of novel discovery is not the limitation.

"Everything that can be invented has already been invented" - Charles H. Duell, 1899.

Please don't use that quote. It's apocryphal; he almost certainly never said that.

It's not about how close we are to "knowing most things," but about how accessible new learning is. In a game of musical chairs, all the easy chairs are taken.

How about this quote from Lord Kelvin:

> The beauty and clearness of the dynamical theory, which asserts heat and light to be modes of motion, is at present obscured by two clouds. I. The first came into existence with the undulatory theory of light, and was dealt with by Fresnel and Dr. Thomas Young; it involved the question, how could the earth move through an elastic solid, such as essentially is the luminiferous ether? II. The second is the Maxwell–Boltzmann doctrine regarding the partition of energy.[0]

As I understand it, the two clouds were dealt with via special relativity and quantum mechanics.

[0] https://books.google.com/books?id=YvoAAAAAYAAJ&pg=PA363#v=on...

I think he used that quote with the year timestamp to show how silly that quote is.

Yes, and david927 is making the counterpoint that the quote is not silly now, nor was it silly then (because it was not made then).

Or to be exact, it's not that the quote isn't silly but that it's not a quote.

The average age of a Nobel laureate continues to climb at a steady rate. We are reaching asymptotic knowledge gains in all the major fields. This fact combined with the tenure track / novelty factor leads to what we see today.

There is, of course, an alternative explanation.

Our society has changed, dramatically, in the past 30 years. We've gone from a completely disconnected society with connections spanning arm's reach to a globally connected society with connections spanning just as far. Baby sitters have been replaced with digital tablets providing endless entertainment and ensuring no person, adult or child, ever need sit with their own recreation being solely within their mind. One needs only look at the slew of digital devices whipped out at any eatery, lest individuals be left to their own thoughts for an agonizing 10 minutes.

The internet and our technology have shattered nearly all educational barriers. Getting an MIT caliber education for free is now something any child in a developing country who has access to the internet can do, if so inclined. On the other hand they can find countless entertaining dramas to read or participate in, find an endless supply of porn, engage in social media and other online sites that are specifically designed for addiction, or find an array of games and other forms of fast response stimuli all designed to make him or her feel good.

Einstein constantly related his tales of meandering strolls as he pondered his work. Another individual that has managed to achieve some great things in modern science is Stephen Hawking. He's mentioned spending decades thinking intently, even if not exclusively, on singular topics. And there is absolutely every reason to believe that is in no way bluster. Given his condition, his mind is his recreation.

Is the breaking down of information barriers in our society helping more than the massive commercialization and directed addiction towards all sorts of mediums and vices, along with the rapid reward systems they offer, are hurting it? It's not just high level academia that's failing. Education systems seem to be deteriorating at all levels, with US (which is the center of much of this change) scores in math and science rapidly falling behind much of the rest of the world. Healthfulness has fallen to dangerously low levels. And more. Regression is not impossible.

This explanation has no support in actual evidence whatsoever.

Finland has seen a clear decline in literacy after the emergence of smartphones in the last decade (every child past 7 has one).

Although, to be fair, the direct effect is explained by increased inequality:


Yet, I know how digital shit affects me. I'm not talking about the devices themselves, I'm talking about being flung up by some stupid pointless shit in social media or by the colorful game which totally hijacks the reward mechanisms in my brain.

I totally fail to see how the effects which promote shallow, fast reward seeking cognition are not detrimental... yeah yeah, it sounds like Greeks whining about reading spoiling the mind but the difference is that our media is engineered to distract and hijack the mind for no valuable outcome whatsoever.

Sure, there are lots of positive effects but that does not mean one should not discuss the negative ones as well.

And no evidence against it. I believe it was intended as a hypothesis.

While most of the comment is indeed a very detailed and interesting analysis, the last part is unfortunately looking like a "it was better before" nostalgia bias.

I'm currently writing my PhD project (after having worked for 10 years in a company) and I'm absolutely stunned by the current ease of access to papers and information compared to just 10 years ago when I left university.

I concur, however, that a new and real challenge is now to be able to filter the noise induced by this overwhelming volume of available information. But I don't think nostalgia is a useful tool to reach this goal.

PS: To illustrate https://xkcd.com/1601/

(And I believe there was another comic that traced it back to Greek philosophers who rejected writing. Thanks to whoever remembers the link!)

The US is a net importer of smart people... that may change though.

Could you say the expectation of novel discovery is to blame, though? If it becomes harder to blaze new trails, but the expectation of discovery remains the same, then scientists who wish to keep getting recognition and pay must find ways to fabricate discoveries to keep up.

I would say the expectation that novel discovery is easy could be to blame. Also that the incentives discourage the process from working in a way that would allow further novel discoveries. Science needs peer review and replication of results to work properly. If the incentives discourage this then novel discovery gets harder. Scientific discovery is a long-term game which means business methods are poorly suited to managing it.

As an aside, that quote is apocryphal. The actual Charles H. Duell said almost the exact opposite.

> People are starting to learn that the vast majority of "science" are poorly-controlled white papers that get accepted and are never looked at again unless it is by a group of replication-crazed people (or what I like to call "actual scientists") reviewing conclusions drawn from decades-old papers.

I wouldn't generalize this to all branches of science. My own experience in physics contradicts that, and I also get the impression that some of the large successes we've seen recently (like gravitational wave detections) couldn't have happened if the scenario you described were the norm.

My own experience in plasma physics contradicts yours. I've been in research labs where the PIs would throw journal articles, relevant to our work, in the trash after reading them and subsequently telling us what was intrinsically wrong with the papers. These weren't papers from unknown institutions, a couple were from PPPL.

I wonder what paper it was...

My advisor and I have found some questionable stuff in PoP, for sure.

True, experimental physics, HEP, astronomy are rigorous, but then theoretical physics is bonkers in areas.

Theoretical physics must be "bonkers" -- if it was just a natural logical conclusion that would appear sensible to a common observer then there's no new theory involved.

I don't mean non-locality in QM, I mean all of the unfalsifiable crap that persists because of personal and institutional investment.

I was under the impression that gravitational wave detection had more to do with engineering advances than science per se.

The most HackerNews of all comments.

"Well it was the engineers who did most of the work..."

Who do you think came up with some of those engineering advances? No need to play semantic scientist vs. engineer arguments. Some of the advances came from scientists and some came from engineers.

I was at Caltech when LIGO was being built. Kip Thorne gave a great lecture in the engineering school about how they went about knowing what to try to engineer in the first place. I'm not an expert on LIGO, but based on the seminars I attended it sounded like there was a lot of science that went into figuring out what the engineering team needed to do.

I may have gone to a similar talk when Kip was at my school, what a coincidence!

To add to this, engineers can use the scientific method.

Engineering uses results from science to do its magic. No transistors without understanding electricity and semiconductors.

But gravitational wave detection wasn't "just" engineering; there's a lot of theory to be done to figure out what signals to look for, how to interpret them etc.

To clarify my comment, I wasn't intending to slight the science behind LIGO, nor the scientists working on the project. Rather, I was trying to mean that the main challenge to detecting gravitational waves was "merely" building a detector sensitive enough to detect signals as faint as gravitational waves were predicted to be.

The original post here is talking about reproducible results and p-values and publishing. Is LIGO a particularly good example of that part of science being done well? I don't know either way.

I was under the impression that many advances in science took this form e.g. invent a microscope or telescope and then spend the next few decades pointing it at things and writing down what you find. Make it better, point it at smaller or further away things. And repeat.

Thank you. I hate that "science" takes credit for a ton of things that are actually discovered / perfected by engineers.

Often things reported in the press as "new science" appear to be engineering. But then I'm sure there's a deal of engineering in science labs and vice-versa.

FWIW I'm much closer to being a scientist than an engineer.

My graduate advisor told me that I should be designing experiments that had counterintuitive results that were solidly significant.

I don't see how you can look for a specific kind of outcome from an experiment without fooling yourself, or cheating.

You can if you admit the possibility of failure. Design an experiment that if it gets a significant result, that result will be solidly significant and counterintuitive. The flip side of that is that most of your results will not be significant, i.e. they will support the accepted scientific consensus. (And these should be published too - lack of interest in reproduced results is what triggered the reproduction crisis.)

It's much like with tech startups - you can design a startup such that it will either be worth billions or zero by tackling a problem that most people believe to be unsolvable but will affect lots of people, and then trying multiple counterintuitive ways of solving it. The vast majority of them will fail because that's what it means to be counterintuitive, but if it succeeds, you have a world-changing company.

Financial instruments as well - an option is a financial contract whose value will either be zero or a lot, depending on whether an unlikely-but-possible event occurs.

Sooner or later everyone will get off their collective asses and start registering studies before they are performed so that this is enforced and tracked. Hopefully.

Funnily enough, some research into parapsychology (and similar), meaning the serious research that goes into stuff barely at the edge of pseudoscience, does exactly this. Preregistration is an important part of getting any experiments into "remote sensing" or "extrasensory perception" taken seriously by the community. The "serious" part of it, anyway. There's a bunch of statisticians who love analysing this kind of thing, and there are parapsychology researchers who love drafting up experiments that challenge and skirt the edges of experimental science.

Not nearly enough of them of course.

But still, it's kind of funny how certain fields in pseudoscience actually drive progress in statistics and scientific methodology :)

My advisor wanted me to build something on his theory on a particular system with a particular problem. I showed him it wasn't possible.

He even made a simple problem more complex just to fit his base theory. It's only about getting citations.

What field?

Before doing an experiment, you can often see that with your settings, if you'd get the counterintuitive result, then it wouldn't be statistically significant, so you'd need to plan a larger experiment or not do it at all.
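A back-of-the-envelope version of that check, for anyone who wants to try it. This is a sketch assuming a two-group comparison of means, a two-sided test, and the usual normal approximation (an exact t-test calculation gives slightly larger numbers); the effect sizes are standardized (Cohen's d) and the function names are mine, not from any particular stats package.

```python
from math import ceil
from statistics import NormalDist

_STD_NORMAL = NormalDist()

def n_per_group(effect_size_d, alpha=0.05, power=0.80):
    """Approximate subjects needed per group to detect a standardized
    mean difference `effect_size_d` (normal approximation)."""
    z_alpha = _STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
    z_power = _STD_NORMAL.inv_cdf(power)
    return ceil(2.0 * ((z_alpha + z_power) / effect_size_d) ** 2)

def achieved_power(n, effect_size_d, alpha=0.05):
    """Approximate power of the same test with n subjects per group.
    Ignores the negligible chance of significance in the wrong direction."""
    z_alpha = _STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
    return _STD_NORMAL.cdf(effect_size_d * (n / 2.0) ** 0.5 - z_alpha)

# Planning: how big does the study need to be for a medium or small effect?
print("d=0.5 needs per group:", n_per_group(0.5))
print("d=0.2 needs per group:", n_per_group(0.2))

# Sanity check on an existing plan: 20 per group chasing d=0.3.
print("power with n=20, d=0.3:", round(achieved_power(20, 0.3), 2))
```

If the achieved power for the effect you hope to find comes out low, a significant result from that design is hard to trust, which is the situation the linked article's title describes.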

You can look at any experiment as having essentially two possible outcomes - you need to ensure that at least one of those outcomes is publishable, because there are many experiments where any possible outcome will be weak.
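The pre-experiment check described above can be sketched in a few lines. This is a rough illustration, not anyone's actual planning procedure: it assumes a two-sided comparison of two group means, uses the normal approximation rather than an exact t-test calculation, and the function name `n_per_group` is made up for the example.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(effect_size: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate per-group sample size for a two-sided, two-sample comparison.

    Normal approximation: n = 2 * ((z_{1-alpha/2} + z_{power}) / d) ** 2,
    where d is the standardized effect size (Cohen's d) you hope to detect.
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    z = NormalDist()  # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the two-sided test
    z_power = z.inv_cdf(power)          # quantile matching the desired power
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


if __name__ == "__main__":
    for d in (0.2, 0.5, 0.8):
        print(f"d = {d}: about {n_per_group(d)} participants per group")
```

If the number that comes out is far beyond what you can recruit, that is the signal to plan a larger experiment or not run it at all.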

You "need" a tentatively supporting result so you can say "more work is necessary", if you get conclusive results it's harder to get funded to work on a new topic I imagine?

It's easy to assume the worst of researchers, but I think Gelman's 2013 "Garden of Forking Paths"[1] does a good job of showing how it doesn't have to be intentional misconduct. It more likely or more often comes down to implicitly selecting your analysis plan given the data, and not correcting for the (hidden) multiple comparison problem that arises when doing frequentist statistical tests in that reality: the theory behind null-hypothesis testing and p-values dictates you account for all the other analytical choices you could have made for that data, or would have made in the face of other datasets.

Gelman's other point is this is all made much worse in inherently low-power fields like psychology and the social sciences. A field like physics can "get away with" it more because making the appropriate corrections is less likely to wildly change the conclusions.

[1]: http://www.stat.columbia.edu/~gelman/research/unpublished/p_...
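To make the hidden multiple-comparison problem concrete, here is a toy simulation. It is only a sketch under a strong simplifying assumption: the alternative analyses are treated as independent, which real forking paths usually are not, so the numbers are illustrative rather than a model of any particular study. The function names are invented for the example.

```python
import random


def familywise_error(num_analyses: int, alpha: float = 0.05) -> float:
    """Chance that at least one of several independent null tests clears alpha."""
    return 1 - (1 - alpha) ** num_analyses


def simulate_forking_paths(num_analyses: int, trials: int = 20_000,
                           alpha: float = 0.05, seed: int = 0) -> float:
    """Monte Carlo version: a 'finding' is reported if the best p-value clears alpha.

    Under a true null hypothesis a valid p-value is uniform on [0, 1],
    so each candidate analysis is modelled as one uniform draw.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        best_p = min(rng.random() for _ in range(num_analyses))
        if best_p < alpha:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    for k in (1, 5, 10, 20):
        print(f"{k:2d} analyses: formula {familywise_error(k):.3f}, "
              f"simulated {simulate_forking_paths(k):.3f}")
```

With one analysis the false-positive rate stays near the nominal 5%; with ten equally plausible analyses to choose from, it is closer to 40% under these assumptions, with no misconduct required.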

One piece of advice I often give to my graduate students is "Pick problems where 'Yes' and 'No' are equally interesting answers."

>My graduate advisor told me that I should be designing experiments that had counterintuitive results that were solidly significant.

This is the advice of a true scientist. I'm refreshed to see that they're still out there.

> This is the advice of a true scientist. I'm refreshed to see that they're still out there.

It encourages students to cheat, because most of the time you won't get either. Not a significant result, and usually not a counterintuitive one either. You're pretty likely to get a significant result that agrees with existing theory, though.

By significant, you mean statistically, not to people in general.

Kind of bizarre that we don't want to report where we weren't able to find a statistically significant relationship.

Of course.

Relevant example:

> The most widely cited test was a 1987 study for Bicycling magazine by engineering professor Chester Kyle, one of the pioneers of cycling aerodynamics. He found that leg-shaving reduced drag by 0.6 per cent, enough to save about 5 seconds over the course of one hour at the brisk speed of 37 kilometres per hour. At slower speeds, the savings would be less.

> [More recent tests in a modern wind tunnel show] that [shaving legs reduces drag] by about 7 per cent (...). In theory, that translates to a 79-second advantage over a 40-kilometre time trial that takes about one hour.

> [The aerodynamicists in charge of the wind tunnel contacted Kyle], to ask if he had any ideas about the discrepancy between the two results. It turned out that the 1987 test involved a fake lower leg in a miniature wind tunnel with or without hair glued onto it – hardly a definitive test, and yet it was enough to persuade most people not to bother with further tests for the next three decades.
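For a sense of how a drag percentage turns into seconds, here is a back-of-the-envelope sketch. It is not the method used in either study: it assumes constant rider power and that all resistance is aerodynamic (power proportional to drag times speed cubed), which overstates the saving because rolling resistance and drivetrain losses are ignored. The function name is invented.

```python
def seconds_saved(drag_reduction: float, baseline_seconds: float = 3600.0) -> float:
    """Upper-bound time saving over a fixed distance at fixed rider power.

    Assumes power ~ drag_area * speed**3, so speed scales as
    drag_area ** (-1/3) and time over a fixed distance as drag_area ** (1/3).
    """
    if not 0 <= drag_reduction < 1:
        raise ValueError("drag_reduction must be a fraction in [0, 1)")
    new_time = baseline_seconds * (1 - drag_reduction) ** (1 / 3)
    return baseline_seconds - new_time


if __name__ == "__main__":
    for reduction in (0.006, 0.07):
        print(f"{reduction:.1%} less drag: up to {seconds_saved(reduction):.0f} s per hour")
```

Under these assumptions the two drag figures give savings a little above the quoted 5 and 79 seconds, which is what you would expect from an upper bound. Note that a 7 per cent change in drag works out to roughly a 2 per cent change in time, not 7 per cent.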


It's interesting because that would have been obvious to anyone looking at the methodology section of the original paper. Unfortunately it only takes one or two review papers paraphrasing or quoting the original before it becomes standard lore (without the methodological caveats).

Interestingly, bicyclists have been shaving their legs anyway, ignoring the scientific consensus. Maybe common sense and personal experience are underrated.

Or cyclists like to shave their legs and want an excuse? Or shaved legs give a psychological boost (more committed), are better for injuries or when getting physio/massages; or a combination.


It's curious to me, because in other areas it seems some degree of surface detail reduces drag (golf balls, aeroplanes too I've heard) - I wonder if short stubble is actually better and that's the effect you get using real legs. That would account for a plastic model only having a small gain.

I don't think this example generalises quite to that conclusion: as the tests show, the differences are huge, having bigger advantages than getting top-of-the-line bike equipment.

Assuming that we're already reaching the tail-end of the bell curve for human performance in cycling sport (which seems likely to me), suggesting minimal individual performance differences, then being among that group despite a 7% performance disadvantage seems downright impossible.

So especially at the top there would be an incredible selection against anyone with hairy legs.

But I guess you could say that natural selection beats human belief.

> It's incredibly frustrating being someone who loves science, works in the field of science, and is skeptical about the system, which used to be a prerequisite and is now looked at like luddite behavior.

I'm concerned about it from a different angle:

How much public policy, medical treatment, and follow-on science is seriously suboptimal due to this house of cards?

>>How much public policy, medical treatment, and follow-on science is seriously suboptimal due to this house of cards?

This you do not want to know.

Here is a good question, the politics and bad science of which were uncovered by the documentary "Bigger, Faster, Stronger: The Side Effects of Being American" - why are exogenous testosterone and anabolic steroids strictly controlled and essentially banned?

What is your first reaction to this question? That they are unhealthy, that they have gross and terrible side effects? Likely this is what you have heard. Nearly 100% of what is popularly "known" about anabolic steroids and use of testosterone is absolutely false.

Exogenous testosterone was banned because a father of a boy who committed suicide (who was taking steroids [0]) rode a publicity wave all the way to Congress, despite the recommendation against the ban by the AMA and other medical experts.

Now, think of what else this may be true of?

[0]: The son was taking steroids, yes. He was also on benzos, drinking alcohol, and doing other recreational drugs.

> People are starting to learn that the vast majority of "science" are poorly-controlled white papers that get accepted and are never looked at again unless it is by a group of replication-crazed people (or what I like to call "actual scientists") reviewing conclusions drawn from decades-old papers.

Correct me if I'm wrong, but I thought the main issue was that papers were being replicated, but the results weren't being published (especially if they were negative). These problems were always being discovered, but the record was never being corrected.

>>Correct me if I'm wrong, but I thought the main issue was that papers were being replicated, but the results weren't being published (especially if they were negative). These problems were always being discovered, but the record was never being corrected.

You may be right, but that's basically the same thing (and as you have sort of pointed out, it's not possible to tell either way).

By replicated I would mean having the same conclusion too - a repeated experiment with opposite results/conclusions isn't a replication, it's a refutation.

If you just do the experiment again that's repetition.

So experiments are being repeated, but results can't be replicated and the results aren't being published, as you note.

The unpublished results perhaps aren't strong enough to make a refutation?

There's kind of a lot of issues. Especially with psychology and sociology and the areas they intersect.

1. Replicability of social psychology results is estimated at 25%. (Cited in article, from http://www.nature.com/news/over-half-of-psychology-studies-f...)

2. People are not reporting non-significant results. Which means we get an incomplete picture of the validity of certain hypotheses. (from article)

3. The numbers don't line up when comparing observed power and significance of results. (from article)

4. Some results data is erroneous or falsified. The GRIM test that surfaced a while back shows numeric "results" that cannot possibly be derived from the sample data. (https://medium.com/@jamesheathers/the-grim-test-a-method-for...) (https://www.vox.com/science-and-health/2016/9/30/13077658/st...)

5. Many authors on the papers don't want to share the data from their papers - even if it's a condition of publication. (from a previous article about the GRIM test that I can't find) This doesn't directly equate to maliciousness or deceit, but it illustrates incorrect attitudes in the field. Some authors consider it undignified to have someone "check their work". Others are concerned about fallout if their data contains errors. (Generic mainstream blog/press article on reproducibility https://www.vox.com/2016/3/14/11219446/psychology-replicatio...)

6. Ridiculously small sample sizes are often used in experiments. Often this doesn't seem to affect how the results are received, though it should. (briefly discussed in article and comments)

7. Papers and studies are cited in articles and future papers before they've been peer reviewed, replicated, or proven sufficiently to justify their inclusion into the field. Thus the results are "baked in" to social psychology as fact at an uncomfortably early stage. (http://blogs.plos.org/mindthebrain/2015/12/30/should-have-se...) This kind of feeds into the hype machine where people focus on significant results - even if the data needs massaging to get there. Even after being refuted (if refuted) the field and its fans never seem to fully recover and back away from bad science. By then it's already made its rounds on Facebook and it's even referenced in the worst of the college intro to psych/sociology courses.

8. Journals in general need to be more open with their data. Especially with publicly-funded research. Paywalls don't help progress.

9. Corporate/politically funded white papers publish studies without making the connections clear.

The field needs a massive overhaul in the sense that its participants need to stop chasing controversial results in the press and instead focus on reproducible results and methods. All fields need some of this, but psychology needs it more than others.
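The GRIM check mentioned in item 4 is simple enough to sketch. The idea: if the underlying data are integers (e.g. Likert responses), the mean of n values must equal some integer total divided by n, so many reported means are arithmetically impossible for the stated sample size. The code below is a minimal illustration, not the published tool; the function name and the tolerance handling are my own assumptions, and the check is only informative when n is small relative to the reported precision (roughly n < 100 for two decimals).

```python
def grim_consistent(reported_mean: float, n: int, decimals: int = 2) -> bool:
    """Return True if some integer total divided by n rounds to reported_mean.

    Assumes every underlying observation is an integer, so the true mean
    must be (integer total) / n.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    half_step = 0.5 * 10 ** -decimals  # largest rounding error a reported mean can hide
    nearest_total = round(reported_mean * n)
    candidates = (nearest_total - 1, nearest_total, nearest_total + 1)
    # Small epsilon guards against floating-point noise at the rounding boundary.
    return any(abs(total / n - reported_mean) <= half_step + 1e-9
               for total in candidates)


if __name__ == "__main__":
    print(grim_consistent(5.18, 28))  # 145 / 28 = 5.1786, which rounds to 5.18
    print(grim_consistent(5.19, 28))  # no integer total over 28 rounds to 5.19
```

A failed check does not prove fraud; typos and unreported exclusions produce the same pattern. It only says the reported numbers cannot all be right as written.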

Great list. My understanding is that negative results are also under published.

Isn't the risk that your negative results harbour a poor experimental methodology, a deficiency in technique or what have you?

That's a pretty big reason not to publish a negative result, especially if the original result has social credence (big name, popular publication, wide acceptance).

Are negative results more likely to have methodological or experimental error than positive results? I've never heard that claim.

I would actually expect the opposite, as finding evidence to support a new claim should be more difficult than not finding evidence.

My understanding is that a negative result has the following format: "we did such and such experiment and did not find a significant statistical relationship between thing 1 and thing 2, after controlling for a bunch of other things".

It's worth noting that this by no means proves that there isn't a relationship, it just means that study wasn't able to find evidence of one. It could be a piece of the puzzle for a potential strong case against such a relationship, or that further research is needed to untangle any confounding factors. Which is why I think all methodologically sound results should be published, no matter how unflashy or boring.
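To put a number on "wasn't able to find evidence", here is a rough sketch of how often a study misses an effect that really exists. It assumes a two-group comparison of means, a normal approximation, and a hypothetical true effect size; the function name is made up, and the tiny chance of a significant result in the wrong direction is ignored.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(effect_size: float, n_per_group: int, alpha: float = 0.05) -> float:
    """Approximate chance of a significant result when the effect is real.

    effect_size is the true standardized difference (Cohen's d) between groups.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)               # two-sided critical value
    expected_z = effect_size * sqrt(n_per_group / 2)  # expected test statistic
    return z.cdf(expected_z - z_crit)


if __name__ == "__main__":
    for n in (20, 50, 200):
        power = approx_power(0.3, n)
        print(f"n = {n:3d} per group: misses a real d = 0.3 effect "
              f"about {1 - power:.0%} of the time")
```

With 20 participants per group, a real but modest effect goes undetected most of the time under these assumptions, which is why a single non-significant result says so little on its own.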

This is also true for finance which is heavily dependent on Excel. Very few people want to audit and verify a model given the monstrosity that some spreadsheets turn into over time.

Ha, very. My previous life as a data scientist and economist knows your comment all too well.

I used to work in medical research and I've been saying this for years. It's amazing what tenuous results get published in fields like biology and psychology, and it's even more amazing how those tenuous results get spun into concrete "facts" by a credulous public.

This is an argument for denying man-made climate change. Be careful.

No, this is an argument about being skeptical about climate change effects, which is good. There are a lot of people weaponizing climate change to affect their careers in a positive way by jumping on the bandwagon.

(And yes, it is my opinion that climate change is absolutely happening. Does not mean that we can't be skeptical of the effect sizes cited.)

I just love it that everyone is saying things like:

"And yes, it is my opinion that climate change is absolutely happening" .

The level of fear in stating simple opinions has gone to 11.

I know. Unfortunately, as you have alluded to, if I don't say that, I'll be accused of being a climate change denier by somebody.

But what's wrong with that? People don't mind being called a cognitive priming denier or whatever. Treating it like an insult adds power to its effect.

It's not how one treats it, it's how one is treated when this insult is used.

No possible discussion follows. All opinions, however tame, are outcast.

What are you trying to say here? That if a fact (such as that a particular field spins tenuous research into facts - let's assume that's a fact) happens to lead us to a conclusion we don't like, we should 'be careful' and pretend the fact is untrue? That's obviously horrifically unscientific.

Or is this the point you are making, and you are expressing a climate-change-denying position, really obtusely? If so, it's on you to show climate science is 'spinning tenuous research into facts'.

> happens to lead us to a conclusion we don't like, we should 'be careful' and pretend the fact is untrue?

Reduce that to:

> happens to lead us to a conclusion we don't like, we should be careful especially given the immense political pressure both within science and in governments.

Liking has little to do with it.

Those who are purists also don't like that others question dogma.

I think it's just very subtle sarcasm that these are dangerous thoughts in a world where thought-crime is real.

Seems like we need to increase the visibility and prestige associated with disproving published results.

Right now, submitting data that contradicts a published paper is likely to be ignored, treated with hostility, or swept under the rug. At very best, you'll get a small publication of a retraction.

Disproving published results needs to be at least as prestigious as producing novel results.

If we can't swing that - it would be great for graduate student research work...

A story I heard once was that, back in the 60's, IBM was having problems with engineering quality (quelle surprise). This led to the formation of a testing team called the Black Team. The Black Team developed a culture that prided itself on breaking systems the engineers thought were rock solid. They delighted in making engineers cry. They even started coming to work dressed in all black and sporting sinister mustaches. Consequently, many of the quality issues abated.

I don't know if it's true, but it sounds like it might be a good idea.

How might we get funding though? Perhaps with a prediction market (people bet on whether scientific hypotheses will be disproved/confirmed)? It could be fun.

Perhaps an open-access Disprove Journal?

That is an amazing and fascinating idea. It should be run on a public facing front with fact-checkers of these fact-checkers who are anonymous.

Whenever I start working somewhere new one of my side projects outside of whatever assigned work or feature work that I have is to write a test case that proves that something that was thought to be in a certain way isn't. After having done so I suddenly feel way more comfortable at my desk.

I'd love to prove things wrong for my master's thesis but novelty is a grading criterion so that's a no-go.

Completely agree. It scares me to dig into research papers and to find very obvious mistakes (anyone at a graduate level is capable of this) and even worse to see how heavily seasoned some of the horrific datasets are.

This kind of stuff despite those flaws gets propagated without much skepticism. It makes me wonder how much stuff we "know" to be true is the result of an intuitive argument but has no real support.

>>It scares me to dig into research papers and to find very obvious mistakes (anyone at a graduate level is capable of this) and even worse to see how heavily seasoned some of the horrific datasets are.

It is scary, and you are right, but you can do your part by blogging about it. Doesn't have to be official, just post the text and your notes. The more people do this, the more we have informal peer-review and replication checking after the fact, and at least a loosely-connected database of blogs and articles will exist when people search the article title.

Say it louder! As a very interested non-scientist layperson, I have learned so much through blog posts critical of poor papers, which I often find through twitter. The more people doing that, the better the state of the field becomes, faster.

It's fascinating to see the "new guard" emerging in psychology, actively acknowledging the poor practices that have preceded them and fighting back where possible.

> "Novelty" is literally written in the "guidelines for authors" sections of many journals.

A while ago I had a manuscript rejected because I used the word "novel"... so hopefully the trend of emphasizing "novel" things [1] is changing.

I'm trying to find the specific manuscript guidelines for that journal, but pretty sure they banned use of the word.

[1] http://www.nature.com/news/novel-amazing-innovative-positive...

Nature itself is to blame for this. Read their guide to authors.

The criteria for publication of scientific papers (Articles and Letters) in Nature are that they:

-report original scientific research (the main results and conclusions must not have been published or submitted elsewhere)

-are of outstanding scientific importance

-reach a conclusion of interest to an interdisciplinary readership.

"Outstanding scientific importance" screams no boring replication studies that confirm pre-existing work.

> "Outstanding scientific importance" screams no boring replication studies that confirm pre-existing work.

The note in the first bullet also forbids replication studies that confirm pre-existing work.

> (the main results and conclusions must not have been published or submitted elsewhere)

It can be interpreted that way, but usually that just means the work cannot be dual submitted for consideration or have appeared as a preprint without explicit permission.

> "Outstanding scientific importance" screams no boring replication studies that confirm pre-existing work.

I don't know, "couldn't replicate this other major study" seems like it would qualify as outstanding scientific importance. It all depends how you frame "scientific importance". Replication seems like it should be high on the list of scientific goals.

>>I don't know, "couldn't replicate this other major study" seems like it would qualify as outstanding scientific importance. It all depends how you frame "scientific importance". Replication seems like it should be high on the list of scientific goals.

You're missing the other side of it. Here's what's not interesting: Replicating the results of a study already believed to be accurate. Yet it is vitally important.

But does it need to be published in the same venue as the original research? Maybe each journal should have a counterpart for replication attempts. Then you can read about stuff you probably haven't heard before in the main publication and go look for replications in the other.

Nature is not exactly your average journal though.

What's the relation between the "scientific industry" and science? I tend to say it's orthogonal.

To arrive at the simplest truth, as Newton knew and practiced, requires years of contemplation. Not activity. Not reasoning. Not calculating. Not busy behaviour of any kind. Not reading. Not talking. Not making an effort. Not thinking. Simply bearing in mind what it is one needs to know

Uh huh. So you're going to understand cancer treatments just by thinking really hard?

Probably by contemplating the results of a double-blinded clinical trial.

I'm particularly interested in metastudies about fields like sociology which are dominated by people of one particular political ideology--I'm curious about the degree to which this homogeneity affects the quality of the work that the field outputs. I don't have a theory, but I am curious about what size grain of salt I should take when interpreting a particular finding.

You have the arrow in the correct direction. The magnitude of the effect is unknowable. But sociology/psychology/etc are highly, highly susceptible to bias and political leanings.

Is it unknowable? Couldn't you take a population of recent studies (because IIRC, the field has never been as homogeneous as it is today) and compare the reproducibility of non-political studies with political studies? Obviously some care needs to be taken with how we code "political" and "non-political", but I would think that could at least give a pretty good indication about the effect.

It would also be interesting to repeat the experiment with studies from different points in time (when there was more ideological diversity) to see how the effect varies with homogeneity. My guess is that if there is an effect, it varies nonlinearly with diversity--in other words, a field with 90% political homogeneity might have an effect that is ten times that of the same field when it was only 70% homogeneous. I suspect that once you hit 90%, dissent is much more effectively suppressed than at 70%. But maybe not?

I expect big data and AI to uncover many 'uncomfortable' sociological trends that were censored by the widely postmodernist-minded academics in the near future.

The academics will deny these hidden truths but the greater population will (and already do) exploit them.

For example?

Have you heard of Jose Duarte? His most recent blog post would be a very good place to start:


No, but thanks so much for sharing!

But if there were high academic standards, how would you manage to get enough published to meet university demands?

The lack of pure replication studies is frustrating, but I am compelled to mention that in many cases the experimental effect of the prior study ends up becoming the baseline condition of the next study. If the earlier experimental effect fails to replicate as a baseline condition, then it isn't a good baseline in the new research, and thus the experiment is discontinued.

I worked as both an academic scientist and in industry.

I can attest there is a lot of crap science published. So much so that when you find a really good publication you think "wow!!"

The one thing about industry is that if the science doesn't work you don't make money. However there is little incentive to publish.

Does this novelty-driven research decline once tenure has been obtained? I would like tenured researchers to take on risks, long-term projects and investigations that many companies will not, regardless of whether they pursue novelty.

That's how one might hope it works, but it doesn't. First of all, it's a lot to expect of anyone who's been struggling in a field for well over a decade (from the start of grad school to the grant of tenure) to suddenly turn around and stop doing the things (like chasing novelty at the expense of rigor) that have been key to academic survival up to that point. Second, the system doesn't incentivize that kind of behavior. If you put your grant pipeline at risk by making a big bet, you could well end up spending most of your time teaching undergraduates because academics who don't pull down lots of grant money are penalized with heavier teaching loads. (Not that this ought to be seen as a penalty, but that's another rant.)

The pressures applied to getting tenure are the same pressures that get applied to keeping your lab funded, getting your graduates and postdocs jobs, etc.

> the vast majority of "science" are poorly-controlled white papers that get accepted and are never looked at again

And worse, the vast majority of the evidence cited in major social science debates is deficient in this way.

What's the disruptive site for academic journals? I.e. There's got to be a way to ensure proper communication of positive/negative results and preferably less cruft.

Well, Sci-Hub is the start of it all. Elbakyan deserves Nobel consideration if you ask me, and I am not being hyperbolic. She has done more for science around the world than the bottom quartile of laureates.

Secondly, supporting OA-friendly journals is the next step. PeerJ is a very good initiative on that front for a lot of reasons - they open and document the peer-review process, which is unheard of. It's a tremendous asset to look through peer review and see how the "science" is really done. I highly recommend everyone find some papers that they are interested in on PeerJ and check the peer review logs.

I confess that I'm seriously considering no longer reviewing for open peer review journals. I vastly prefer double-blinding.

PeerJ is single-blinded in favor of the reviewers, if that makes you feel better. You can remove blinding at your discretion.

It is much easier to demand novelty rather than repeated validation since most academic-based research completely stops at trying to apply said research and found conclusions to a real thing. Because if they were to go and apply their findings to something in the real-world (a product, service, or some good that can be delivered and/or consumed) it would often quickly put their research to the flame, revealing either its flaws or its possible validity.

As someone who works in science what do you think is the underlying cause of this shift?

I think it's multivariate. The rise of for-profit publishers certainly plays a large role, as does tenure track limitations, the booming business of education, and just the general pushback against openness. People fear the sunlight, no matter what "science" is supposed to be about.

I don't think it's any one thing or that there's an easy fix. For my part, I refuse to publish in non-OA journals and strongly prefer those who require/strongly encourage both open data and open peer review. PeerJ is one such journal which also features very fast time to first decision AND low fees. I cannot recommend them enough.

Scoring research by reliability in addition to novelty would help fix the incentives in research publications.

Replicating research would boost your reliability score and the original's too which would boost credibility for future research.

Agreed. I'd also like to point out that the review process itself is pretty broken. People sometimes behave like sociopaths (e.g. torpedo a paper that is competing with your work), personal grudges etc.

Doesn't science also apply to science? We should be sciencing our science, which means being skeptical of all conclusions, even your own. I feel your stress. I used to even call myself a scientist, and regard myself as a scientist, but then I realized 90% of my work is entirely unmeasured and there is no sense of effectiveness of the remaining 10% that actually makes it to production.

I think we need a lot more focus and funding on a field of study/discipline that studies how science is done. I'm not honestly sure who in the academy, if anyone, is doing this. "History of science" comes close (it's not always "in the past"), but generally scrupulously avoids making any determinations of the "validity" of the science, it tends to be like a _social_ study of how science is done, intentionally having no opinions on how to make it "better" or "worse".

Are there any departments or disciplines or academic careers that can be made on scrupulous scientific study of how science goes right and wrong, including statistical as well as other issues? I think we need it.

Considering that scientists are reluctant to accept scientific evidence when it applies to them, I'd say no.


I cringe reading "belief in well-supported scientific conclusions is not optional."

Belief is always optional; if you want people to believe, you have to get them to opt in to believing. Telling them they must believe rarely accomplishes this and is emblematic of the type of scientism that has displaced real science. Of course, the thing this clown was telling us we didn't have the option of disbelieving turns out not to be true. I suppose now he's willing to grant us that option. How exactly does Kahneman expect people to distinguish "well-supported" science from not "well-supported" science when he himself is obviously wholly incapable of doing so?

One good heuristic to use when evaluating scientific research is the Lindy Effect. Essentially, the longer a finding has stood without being falsified, the longer you can expect it to continue to stand without falsification.

This runs contrary to what I was taught in high school science classes, which was that newer science is more reliable than older science. The truth is that the old stuff that still stands is really where it's at. Most "novel" scientific findings are not true. A smaller portion will be thought true for a while, then discarded. Only a very small amount of research will stand for a long period of time.

Another thing to consider is that the scientific method and its modern, institutionalized implementation is not very old at all. You cannot exclude the possibility that some of our fundamental scientific understanding is totally flawed, and we've yet to discover how.
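The Lindy heuristic above has a tidy mathematical form if you are willing to assume that the "lifetime" of a finding follows a Pareto (power-law) distribution. That is purely a modelling assumption for illustration; nothing here says scientific findings actually behave this way. Under it, the expected remaining lifetime of something that has survived to age t is t / (shape - 1), i.e. proportional to how long it has already lasted. The function names below are invented.

```python
import random


def expected_remaining(age: float, shape: float) -> float:
    """E[T - age | T > age] for a Pareto lifetime T with the given shape."""
    if shape <= 1:
        raise ValueError("the mean is infinite unless shape > 1")
    return age / (shape - 1)


def simulated_remaining(age: float, shape: float,
                        samples: int = 200_000, seed: int = 0) -> float:
    """Monte Carlo check; random.paretovariate draws lifetimes >= 1, so use age >= 1."""
    rng = random.Random(seed)
    lifetimes = (rng.paretovariate(shape) for _ in range(samples))
    survivors = [t for t in lifetimes if t > age]
    return sum(t - age for t in survivors) / len(survivors)


if __name__ == "__main__":
    for age in (1, 2, 4):
        print(f"survived to {age}: formula {expected_remaining(age, 3):.2f}, "
              f"simulated {simulated_remaining(age, 3):.2f}")
```

Doubling the age doubles the expected remaining life in this model, which is the opposite of what you get for things that wear out.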

> You cannot exclude the possibility that some of our fundamental scientific understanding is totally flawed, and we've yet to discover how.

Much of modern technology is built on modern scientific findings. Since that technology unarguably works, the findings on which it is based cannot be "totally flawed".

This is wrong on a few levels. Much of modern technology is built on modern scientific findings but few modern scientific findings have technologies built on them. There are entire fields, such as social sciences, which couldn't really be said to have any technology based on them. More importantly though, technology working isn't the same as an experiment; fire worked as a technology well before we truly understood the underlying physics.

I agree that soft sciences are much more likely to be seriously flawed.

That said, working technology absolutely is experimental validation of the underlying science on which it is based (if any).

It's very tricky to pinpoint what "the underlying science is." For example my phone is a piece of technology, you could argue it provides experimental evidence for any number of scientific theories. However, the hallmark of an experiment is that it can fail and in so doing disprove your theory. Can my phone fail? It can certainly fail as a phone and most phones do eventually fail for some reason or another. Does that mean that the experiment has failed and the underlying scientific theories need to be thrown out? Probably not, because it's failing as a phone, not as an experiment. What failure means when you try to consider your phone as an experiment is very unclear to me and without a clear failure case that would invalidate the theory it's useless as an experiment.

Transistors (semiconductors) work because of quantum mechanics. If it didn't work the way we expect, the CPU would be a useless piece of rock.

The GPS receiver only works properly because we understand and account for general relativity.

The radio relies on a lot of stuff related to information theory. Which might be science, depending on how you think of applied math.

The screen and battery are built on a lot of materials science, which I don't know much of anything about.


> What failure means when you try to consider your phone as an experiment is very unclear to me

A phone is far too complex and intertwingled to be a proper experiment; if it works it proves a lot of things, but because of that, if it doesn't work it doesn't disprove much of anything.

If the phone works, it says "all of these things are true", which is a lot of information.

If the phone doesn't work, it says "at least one of these things is not true", which is correspondingly less information.

Those things are tied together. Someone who knows more about statistics and information theory could probably explain how a lot better than I can.
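To put rough numbers on that, here is a minimal sketch rather than a rigorous treatment. It assumes the phone works only if every one of n theories is true, and it assumes a uniform prior over all true/false combinations, which is my simplification and not something claimed above. The function name `bits_learned` is made up for illustration.

```python
import math

def bits_learned(n_theories: int, phone_works: bool) -> float:
    """Bits of information gained about which theories are true.

    Assumes (hypothetically) a uniform prior over all 2**n true/false
    assignments, and that the phone works only if every theory is true.
    Call with n_theories >= 1.
    """
    worlds_before = 2 ** n_theories
    # A working phone leaves exactly one consistent world (all true);
    # a broken phone rules out only that single world.
    worlds_after = 1 if phone_works else worlds_before - 1
    return math.log2(worlds_before) - math.log2(worlds_after)

for n in (1, 4, 10):
    print(n, bits_learned(n, True), round(bits_learned(n, False), 4))
```

Under these assumptions a working phone yields n bits, while a broken one yields only a sliver of a bit once n is large. With a different prior (say, strong confidence in each theory beforehand) the numbers shift, so treat this as an illustration of the asymmetry, not a measurement of it.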

Orbits of planets were calculated long before General Relativity. I believe we put a man on the moon using Newtonian physics. Many types of medicine are successful without us knowing exactly why.

Just because technology that is built on top of science works doesn't mean that it is 100% correct. It isn't that our theories are outright wrong, but they probably still are incomplete or are only approximations.

"Orbits of planets were calculated long before General Relativity"

Interesting you would say that. They were calculated and the orbit of Mercury was not predicted correctly according to Newtonian Physics. Using Einstein's general relativity, the calculations matched observations.

They matched the observations that were available at the time really well. Even pre-Galilean models matched the observations of their time.

"It works" isn't necessarily a great barometer for truth even for direct applications of theories. Ptolemaic astronomical models actually worked pretty darn well, even though they were based on a geocentric universe.


If any phone ever works, it means that some of the underlying theories might be true. Any specific phone means nothing. And a phone that isn't working has absolutely nothing to do with any proof.

Implementations of objects that rely on theories are emphatically not proofs of them. And failed implementations of those objects are not disproof.

>fire worked as a technology well before we truly understood underlying physics.

Just following the analogy; what science was fire validating during that time? Just because something works doesn't mean it wasn't created by trial and error.

In regards to your last paragraph, I'd like to mention that we made use of electricity while still having erroneous ideas about the electron. We even made use of it before the electron was theorized.

I'd also like to mention that I am a scientist, albeit retired. I'm wrong more times, by 09:00, than most people are all day. I have no reason to believe we are at some sort of apex of knowledge.

The other day, they released findings that showed they can mathematically predict quantum chaos. I'm still pondering the implications. There is so much we don't know, and that's a great thing.

To me, this is the difference between a "good enough" theory vs. a "totally flawed" theory. Obviously, there's a gray area between the two, but I think the distinction is valid. We had a working model of electricity that was good enough for some initial applications. Contrast with, say, Lysenkoism in Russia, which was totally flawed.

BTW, I never said we were at an apex of knowledge. I'm trying to quell the concern that what little we do know is somehow going to turn to mud one day. That's very unlikely.

Off topic, but mind linking the paper you mention?

Start here:


I'm going to give it a serious investigation over the winter. I don't have the time to devote to it until then.

It can work, but for reasons different to our positive scientific understanding, no?

No. The odds of, say, the GPS in your mobile phone working by accident are effectively zero.

It's not that it would work by accident, it would be more like it works by a principle to which our physics is a good enough approximation.

Yes, but then our physics wouldn't be "totally flawed". It would just be incomplete in some ways.

How large is the preimage of conclusions that would lead to the mathematics of relativity? Is it improbable in light of that?

I should also say that physics has a better track record than softer sciences. But even in physics, things change.

There's no member of that preimage in which modern science is totally flawed.

Yet we're posting in a thread relating to a massive replication crisis.

The geocentric model of the universe was equally obvious and functional to our ancestors in the context in which they used it. We are the same creatures as them and are subject to the same basic epistemological limitations -- just because we have seen further does not mean that we have seen everything. The whole concept of science is exactly to that point. It is implausible under currently available information that we could be missing some fact that would fundamentally change our understanding, but that is not the same thing as saying that there is no possible fact that could cause such a rupture.

I guess it just depends on the definition of "totally." It is very much possible to develop something based on the insights from Newtonian physics for example. Newtonian physics is flawed, but often good enough.

Right. Newtonian physics is an example of a good enough theory. Although it's been proven "wrong", it's still taught in schools and will suffice to get a rocket from Florida to Mare Tranquillitatis. No mean feat.

>which was that newer science is more reliable than older science.

who taught that? I notice a lot of people seem to think this is true. Seems obviously crazy to me! People sure aren't any smarter now.

Yes, that gave me pause too. I mean, a prominent researcher, Nobel laureate no less, writes this, what is the conclusion? Anybody who doubts it is a quack, a charlatan, at best - and a "denier", an underminer of science, evil incarnate at worst.

And then it turns out - oops, these conclusions are not well-supported and not that scientific in fact. And those who doubted them were actually correct. Does it mean anybody who doubts anything is correct? Surely not. Does it mean we should be a bit more careful with statements like "belief is not optional" and the ensuing condemnation of those who fail to comply? I think yes.

I don't have a better way to say this. Trust me, I've tried to find one.

Science has, in many ways, taken on some of the characteristics of religion. You either believe or you're a heretic. Which, sort of, I agree with. If you don't believe in the results of the scientific method, you may wish to reexamine your life.

However, a lot of what is published under the name of science hasn't actually followed the scientific method. You get people who believe in the strangest things.

A recent example was someone who told me that the oceans were going to rise by 57' by 2050. Now, I'm firmly in the camp that believes in AGW. I pointed out that they were wrong, that no models suggested such a thing, and that their citation had no citations of its own, other than an article in some newspaper.

I provided link after link. I explained that I am not a climate scientist, but a mathematician. I explained that I'd actually run the models locally, just to learn more. I explained the process, the data collection methods, and even why the data is adjusted.

They decided I was a 'denier' and a Trump voter - quite literally, they called me a 'science denying Trump voter.'

I was baffled.

Later, I'd bump into them again, at the same site. This time, I explained that science was a philosophy. Of course, I provided the citation for this and even went so far as to take the time to explain what Ph.D. actually means.

They went ballistic, so to speak. Oh, the names they called me.

This is not an isolated case, they had others chiming in. It wasn't until several days later when someone finally noticed and chimed in to support me, but it was of no use.

That person is a science believer. I can't tell them to stop believing in science. However, it has the hallmarks of a religion, it's certainly a belief system. No, being a belief system isn't a bad thing.

I don't have the answers, but this is a problem. Sorry for the verbosity, but I haven't a better way to describe it, nor a better descriptive term than religion.

>>I don't have a better way to say this. Trust me, I've tried to find one.

>>Science has, in many ways, taken on some of the characteristics of religion.

There is no better way to put it. I have said this the same way. There is a growing sentiment of a "science clergy," a bureaucracy that controls Truth (tm).

Men in lab coats have become the new priests, handing down ordained works for the masses and being held up on a pedestal.

I say this as a scientist.

It's almost as strange as how people will ask me medical questions when I'm introduced as Dr. KGIII. I can kinda understand when people ask me things outside my discipline, but I'm the last person you want to ask about your medical problems.

Maybe they just thought you were exaggerating, like many people do in reply to this insight by Kahneman. From the fact that Kahneman misjudged some studies in experimental psychology it doesn't follow that all of science is hopelessly lost, philosophical, or wrong.

To put it another way, most people I've met who criticized science couldn't even explain how a toaster works if they had to and had no grasp of what they were criticizing. Hearing superficial criticisms by people who do not know anything about the methodology and maths can be very tiring, and these criticisms have become abundant in online forums. It can be easy to overcompensate and react needlessly aggressively to such general criticisms. At the same time, the critics of science I've met so far (in academia, and all leftists, not that it matters) were all relying on the outcomes of obvious and constantly ongoing scientific progress on a daily basis. From lossy image compression to GPS to microchips to modern medicine and diagnostics.

Maybe in your case people overreacted and missed the fact that you're a smart mathematician and not some random dumbass who says "statistics lie" as if that was an argument for anything, and you felt butthurt about it. Still, they are right when they say that science is not a philosophy. I should know that, because I have a Ph.D. in philosophy, and the standards for our 'theories' are not even remotely as stringent as in natural science or mathematics. And maybe you're also overreacting a bit.

Contrary to what seems to have become fashionable - mostly in certain political circles outside of Europe to be honest - the scientific process continues to converge to the truth in almost all disciplines and we reap the benefits of science daily. (I say almost all because I'd like to exclude literary science and the like.)

> I mean, a prominent researcher, Nobel laureate no less, writes this, what is the conclusion?

That even smart people can be dumb.

“Some ideas are so stupid that only intellectuals believe them.”

― George Orwell

I cringe reading "belief in well-supported scientific conclusions is not optional."

Belief is always optional, if you want people to believe you have to get them to opt-in to believing.

Kahneman is describing part of a larger framework for human beings to cope with their own flawed human thought processes. You're talking about the art of persuasion.

If "you want people to believe" the most effective way to accomplish that is generally to lie to them in a clever way. That's a whole other discussion.

Belief should not be optional in the face of well-supported scientific conclusions. The problem is it's easy to mistake un-supported conclusions for well-supported conclusions.

Then the rule is "you should unquestionably believe in claims from the set A, but not from the set B, and there's no way we can tell for each case to which set it belongs. But we will always claim it's A". Not a very objective or useful rule.

Belief should always be optional. Behavior based on that belief shouldn't.

I'm fine with anyone who wants to question climate change, evolution or even something as accepted as gravity. Questioning is how we refine our thinking.

Where I get less okay is when people use their questioning to justify behaviors that, should they be wrong, are detrimental to society and the world. Question climate change all you like, but stop polluting until you've convinced the majority of climate scientists. Question evolution, but stop preventing schools from teaching it or forcing them to give creationism an equal standing until you've got something that is, at least, equally supported.

Science rarely does black and white, but you can still get a lot of benefit from altering behavior based on a scientific prediction that is 60%, 80%, 90%, 99% or 99.999% likely/accurate. Forcing belief doesn't honor that small bit of doubt that's inherent in any scientific study. But betting against what science believes to be likely when there are negative consequences to being wrong is stupid and we need to stop doing it.

> Belief should always be optional. Behavior based on that belief shouldn't.

> Question climate change all you like, but stop polluting until you've convinced the majority of climate scientists.

Just because they're all convinced it's completely our fault things are changing, doesn't mean that the prescribed remedies will actually work.

What if the models are wrong on the low side, and we're wasting resources trying to stop the inevitable when we should be figuring out how to survive it?

There is no real "inevitable" with climate change. More CO2 in the atmosphere will cause higher temperatures. There may be a "tipping point" where a little more CO2 will cause the system to shift dramatically to a higher temperature. But we sure don't know where that is. To give up trying to do anything about climate change (lower emissions, geoengineering, etc.) because it is inevitable we will have 2 degrees of temperature rise no matter what is to lose track of the fact that 2 degrees would be much better than 5 degrees, which is much better than 10 degrees.

Part of the way we go from "un-supported conclusions" to "well-supported conclusions" is by the optional nature of belief. Because belief is optional, the proponents actually have to go to the effort of supporting their conclusions with evidence. When believing something becomes non-optional, the incentive to do that goes out the window, and conclusions generally become weaker.

To wit, it shouldn't be too surprising that the "scientist" who tells people that belief in his work is non-optional is the one who didn't bother to check if it's actually true.

>Belief should not be optional in the face of well-supported scientific conclusions.

What if I don't trust the scientists doing the work, or the larger system that they work in? What if I find the common incarnation of the entire field lacking? I do not trust much published by sociology, regardless of how many studies show results, because of the political nature of the field.

On the other hand, physicists who say they found a new particle? I don't have an ounce of doubt (though maybe I should have a little).

Part of the problem is that the very language used in sociology is crafted to reach the desired findings. Once the integrity of an entire field has been lost, what can be done to restore it?

I politely disagree. People are free to believe what they want. Belief should be optional. Reality doesn't much care what people believe. I'm not big on forcing people to think a certain way, nor am I sure how one would enforce such.

Good scientists have a well tuned sense of how much to doubt and believe various ideas. Some things can be determined to be true to high confidence and are most certainly true. Physicists try to get five or six sigma. For the LIGO black hole merger detection you get a statement like, "The signal was observed with a matched-filter signal-to-noise ratio of 24 and a false alarm rate estimated to be less than 1 event per 203 000 years, equivalent to a significance greater than 5.1σ." And then the physicists fight/debate over how good this estimate of uncertainty is.
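For anyone who wants to translate "sigma" into a probability, here is a small sketch. It assumes the common convention of a one-sided tail of a standard normal distribution; the thread does not say exactly how the LIGO team converted their false alarm rate into 5.1σ, so this shows the general idea and is not their calculation. The function name `one_sided_p` is just an illustrative choice.

```python
import math

def one_sided_p(sigma: float) -> float:
    """Upper-tail probability of a standard normal beyond `sigma`.

    Assumes the one-sided Gaussian convention; other conventions
    (two-sided, look-elsewhere corrections) give different numbers.
    """
    return 0.5 * math.erfc(sigma / math.sqrt(2))

for s in (1.645, 3.0, 5.0, 5.1):
    print(f"{s} sigma -> p = {one_sided_p(s):.3g}")
```

Under that convention, roughly 1.645σ corresponds to p = 0.05, while five sigma is a tail probability on the order of a few in ten million, which gives a feel for how different the two evidentiary bars are.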

Technology is created based on using facts discovered by physicists - layer, upon layer, upon layer of very precise and constant use of fundamental ideas of physics. If unexpected things keep happening, it gets investigated and sometimes "new physics" is discovered.

On the other extreme (of what some call science) there are the social sciences and psychology. A study there typically stakes its claim to being true on p = 0.05 (nominally 95% confidence). There are lots of problems with p-hacking and replication. Scientific experiments are either unethical or impossible.

Here is a cute cartoon about the spectrum (https://www.xkcd.com/435/), although I would title it, "Fields arranged by certainty of truth". I work in geology research where the certainty of truth of the ideas/theories we promote varies across the whole range. This can be quite fun, but one needs a very well developed sense of how true the ideas you hold are to make progress.

The subtleties on the probability of an idea being true and over which domain it applies to is why people consult "experts". Hard to choose an expert when you are not an expert.

I will believe as soon as you justify the correct answer to the demarcation problem. If you don't have an answer, then I'm free to interpret scientific results in my own way, including disbelief.

C'mon, use logic and reason.

> Belief should not be optional

What does that mean in real terms? What do you propose doing to those people who take the option of not believing? If the answer is nothing then belief is indeed optional.

disinvited to social events, placed lower on the mate selection hierarchy, opinions not taken seriously, media makes fun of them, all the normal ways culture shapes behavior.

I'm not really sure what you mean by "not optional", but my interpretation is "Anyone who disagrees with the reigning philosopher-kings will be fined, jailed, or shot".

I suppose there's ostracism for a non-state solution. So if you're a business, maybe once a year you make all of your employees measure some toads, and if they can't apply a Kolmogorov-Smirnov test correctly on the first try they're immediately fired and added to a shared blacklist.

How exactly are you handling enforcement?

More likely they'll be branded as "not believing in science" and unpersoned. But yes, this comment is spot on.

Belief is not binary.

"Science is the belief in the ignorance of experts." (Richard Feynman)

> if you want people to believe you have to get them to opt-in to believing

There's a very nice article I read recently [1], on a situation in educational research (as in, how learning happens). The authors point out a situation where a group of 30 scientists published a letter informing teachers about a persistent myth about students. They go into a beautiful examination of the impact of this, what the scientists (possibly) miss, and what effective persuasion may look like. The authors' recommendations, suitably generalized, are a great read for our times, when polarization and tribalism are high, with a startlingly low tolerance for people with “different” ideas. Highly recommended:

[1]: https://deansforimpact.org/why-mythbusting-fails-a-guide-to-...

Your mention of scientism reminded me of this good, recent article from the Australian "The Skeptic" magazine.

Skepticism, Science and Scientism https://yandoo.wordpress.com/2017/09/12/skepticism-science-a...

Have a read! I also submitted to HN at the top level. (https://news.ycombinator.com/item?id=15235291)

One approach is to never believe anything.

Belief is a problem, in my opinion, in every situation.

By definition, a belief is something you accept as true with insufficient evidence. It's kind of the opposite of knowing things.

So why believe at all when you can both think and know. Beliefs are garbage and the number 1 source of problems in this country.

agreed. why rely on evidence when we can trust the 'experts'? no thanks...

I find the response as interesting as the facts. For me, science is always "as far as we know..." and for others, they have created anchors based on a scientific result which makes them angry if that science is later altered.

I think Kahneman's response was spot on and real. And I think by responding that way he gives meaning and life to the scientific process of constantly questioning and reviewing and trying to improve.

So much about a person is evident when you challenge their beliefs. There is a range of responses but I find people who respond with "help me understand how you arrived at your [conflicting] result" get to better results than people who respond with "if you believe that [conflicting] result then you must not understand what I said, here let me explain it again in simpler terms for you." The latter don't do as well and don't have as much impact.

I really hope this comment gets found.

Barbara Ehrenreich wrote a book in 2009 called "Bright-Sided" that was an incredibly devastating critique (if you managed to hear about it) of Positive Psychology. I finished the book a week ago, and I say this as someone who really liked (and still likes) Tony Robbins, but it's scary stuff. The most horrifying part, although not too unexpected, is a conference near the end packed with Positive Psychology PhD students chomping at the bit to get the sweet sweet consulting jobs in the various global cities around the world, and Martin Seligman, the head of the American Psychological Association (APA), having to subtly hint that it's all a bubble, and these students realizing they might have been lied to.

I should also say that she has many concerns with Martin Seligman, another researcher of somewhat similar success to Kahneman.

In defense of all these people though, with the cuts to research funding over the years what else could they have done?

A deeper point here is the general failure of Psychology. Karen Horney, a tremendously talented psychotherapist from the 1930s, argues that true psychology is more sociology than psychology. In general this focus on the mind, on the internal, 1) is a very Western style of thinking (specifically that of the Calvinists, who were obsessed with deep introspection) and 2) has only accelerated over the years as positive self-help thinking asks people to retreat further and further into the mind as a defense against lower incomes, lower economic mobility, or Facebook-incited jealousy.

You might find the works of Thomas Szasz to be of interest. His viewpoints are very similar to the ones you have mentioned in regards to psychiatry.


Can you elaborate on the Tony Robbins part, because I like him as well? Once in a while I listen to him on my commute, and some of his insights are really catchy.

> In general this focus on the mind, on the internal

Layman here. Upon seeing a black box that performs tasks that are otherwise intractable, isn't it all-but-natural to desire to crack the box open and figure out how it works?

(n.b. the inability to artificially replicate the human mind's full capabilities, is one of the few reasons we continue to tolerate the many failings of the human form and mind.)

For those who like me didn't know what "priming" is in this context:

"Priming is an implicit memory effect in which exposure to one stimulus (i.e., perceptual pattern) influences the response to another stimulus. ... For example, NURSE is recognized more quickly following DOCTOR than following BREAD."


To be clear, the DOCTOR/NURSE example is an example of "semantic priming" as mentioned near the end of the blog comment. There's decent experimental evidence for semantic priming.

The "priming" that's mostly being discussed in the comment and blog post it's replying to, on the other hand, is macro-behavioral priming, for lack of a better term: priming significantly affecting complex physical or social behaviors. Things like "seeing old-age-related words makes you walk slower" or "holding a hot vs cold beverage for a few seconds before an interview radically changes your opinion of the interviewer". The evidence here is... well, that's a lot of what the non-replicability fuss is about.

Summary: explains context and background for a comment left by Kahneman https://replicationindex.wordpress.com/2017/02/02/reconstruc...

"I accept the basic conclusions of this blog. To be clear, I do so (1) without expressing an opinion about the statistical techniques it employed and (2) without stating an opinion about the validity and replicability of the individual studies I cited.

What the blog gets absolutely right is that I placed too much faith in underpowered studies. [...] My position when I wrote “Thinking, Fast and Slow” was that if a large body of evidence published in reputable journals supports an initially implausible conclusion, then scientific norms require us to believe that conclusion. Implausibility is not sufficient to justify disbelief, and belief in well-supported scientific conclusions is not optional. This position still seems reasonable to me, but the argument only holds when all relevant results are published."

[Edit/add] Kahneman also outlined an approach to address concerns in 2012 in an open letter in Nature see https://www.nature.com/polopoly_fs/7.6716.1349271308!/suppin... [linked from https://replicationindex.wordpress.com/2017/02/02/reconstruc...] which was apparently ignored by the priming researchers.

He goes on to say that he still believes actions can be primed.

I believe that chapter on priming has been widely cited and used in practice. For example, law enforcement agencies use the ideas behind priming whenever they run a sting operation, and clearly believe it actually works. Look at the new prevalence of terrorism, gun, drug, and rape stings.

are you quoting from something?

yes I put the link first before the quote https://replicationindex.wordpress.com/2017/02/02/reconstruc...

Daniel Kahneman responded to this blog in the comments section.

We've added quotation marks to disambiguate that.

Just a quick plug for one of my favorite periodicals, the Journal of Articles in Support of the Null Hypothesis: http://www.jasnh.com/ .

If you're worried about the "file drawer problem" JASNH might help you feel a little better about the future.

Naively one would think that the scientific method -- "guess that X does Y, try it, see whether it worked" -- depends as much on people telling each other what DIDN'T work as what DID. Yet many journals will refuse to publish negative results (perhaps aside from attempts to replicate significant results).

JASNH takes the opposite publication bias and gives a refreshing view into what stuff people are doing that takes up real research effort, seems to be real work, but didn't show an effect.

I have no illusions that JASNH is, like, a super impactful journal -- it's online-only and accepts a weird mix of disciplines -- but it's nice to see that people are trying to do the right thing.

While I agree with Kahneman that it is easy to fool yourself by looking at studies with small sample sizes, the recent focus on reproducing existing research is, to my mind, misguided. It is extremely difficult to perform experiments exactly the same way, even within one lab, much less between many different labs. For a good example of this, see this fascinating description of how several labs tried to standardize the way they handle C. elegans (worms) to identify compounds which extended the life of the animals: http://www.nature.com/news/a-long-journey-to-reproducible-re...

While some effort to standardize is important, it also wastes a lot of time setting up a specific set of experimental conditions that may not have much resemblance to the conditions that obtain in the real world. In my opinion, we learn much more by taking someone's existing result, thinking through the consequences and then designing well-powered experiments that probe the assumptions, mechanisms and applicability of the result. With critical eyes and diverse systems, we won't fool ourselves.

One more note: if this topic interests you, please read The Structure of Scientific Revolutions. If you are unfamiliar with the book, I guarantee it will completely change how you think about science as a human endeavour and make you much more comfortable with the existence of long periods of time where science just gets some things wrong.

I think it's important to walk before you run. If social scientists aren't able to get compatible results from the same experiment done by two different teams, then I seriously doubt their ability to extract any relevant subtle truths about human behavior.

The C. elegans story you linked is a great example of this. Getting the little details right matters. I can't imagine having any luck doing that while constantly changing the big details.

>"Hundreds of e-mails and many teleconferences later, we converged on a technique but still had a stupendous three-day difference in lifespan between labs. The problem, it turned out, was notation — one lab determined age on the basis of when an egg hatched, others on when it was laid."

Wow... this is pretty basic info on what was measured.

> In my opinion, we learn much more by taking someone's existing result, thinking through the consequences and then designing well-powered experiments that probe the assumptions, mechanisms and applicability of the result.

I think that, too, is considered replication by many.

The replication projects I am specifically referring to are similar to eLife's "Reproducibility Project: Cancer Biology": https://elifesciences.org/collections/9b1e83d1/reproducibili... http://www.sciencemag.org/news/2017/01/rigorous-replication-...

This is another vote for The Structure of Scientific Revolutions.

Such an incredible book.

> This does not mean that we can trust the published results, but it does suggest that some of the published results might be replicable in larger replication studies with more power to detect small effects. At the same time, the graph shows clear evidence for a selection effect.

So, there might be a priming effect but the studies that Kahneman used don't necessarily show that? Is that right?

"So, there might be a priming effect but the studies that Kahneman used don't necessarily show that? Is that right?"

Perhaps more importantly, there still might not be.

Part of the problem is that priming isn't a binary thing, but a range. Some uses of things that could be described as priming are so well established that even if they are not "science", they are certainly engineering, inasmuch as marketers successfully use them routinely. On the other hand, consider studies that seem to show that if you flash words of negative connotation faster than they can be consciously read (or possibly even consciously seen), pictures conforming to stereotypes associated with those words are recognized slightly more quickly. Those may turn out to be the bunk they intuitively seem to be after all. (Note that I'm not saying they're bunk because our intuition says they are. But contrary to what seems to be a somewhat popular belief, it is in fact possible for our intuition to be correct. It's one of those things where you only ever hear about where it's wrong, precisely because that is in some sense news. It's right quite often, more so for one trained on existing science.)

Personally I'd say this is one of those cases where the recent Nature proposal to up the standard of significance from 0.05 to 0.005 would probably have been helpful. https://news.ycombinator.com/item?id=15192610 If implemented it wouldn't solve everything instantly, but it would certainly raise the bar on this sort of side track being taken.
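A back-of-the-envelope sketch of why a stricter threshold helps. The prior (1 in 10 tested hypotheses being true) and the power (0.8) are made-up illustrative numbers, not figures from the proposal, and `false_positive_share` is a hypothetical helper name.

```python
def false_positive_share(alpha: float, power: float, prior_true: float) -> float:
    """Expected fraction of 'significant' results that are false positives.

    alpha:      significance threshold (chance a null effect passes)
    power:      chance a real effect passes
    prior_true: assumed share of tested hypotheses that are actually true
    """
    false_pos = alpha * (1.0 - prior_true)
    true_pos = power * prior_true
    return false_pos / (false_pos + true_pos)

# Illustrative assumptions only: 10% of hypotheses true, 80% power.
for alpha in (0.05, 0.005):
    share = false_positive_share(alpha, power=0.8, prior_true=0.1)
    print(f"alpha={alpha}: {share:.1%} of significant findings are false")
```

Note that this holds power fixed. In practice, tightening the threshold without collecting more data lowers power, so the real-world gain would be smaller than this toy calculation suggests.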

Yes. Essentially there is some qualitative evidence that there is a priming effect, but the quantitative evidence is insufficient given the sample sizes collected by the cited studies.
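To make "insufficient given the sample sizes" concrete, here is a rough power calculation. It uses a normal approximation for a two-sided comparison of two group means, and the effect size and group sizes are illustrative assumptions of mine, not numbers from the priming studies. `approx_power` is a hypothetical helper built on the standard library's `statistics.NormalDist`.

```python
from statistics import NormalDist

def approx_power(effect_size: float, n_per_group: int, alpha: float = 0.05) -> float:
    """Normal-approximation power of a two-sided, two-sample comparison of means.

    effect_size is the standardized mean difference (Cohen's d).
    An exact t-test calculation would give slightly lower values
    for small samples.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1.0 - alpha / 2.0)
    # Expected z-statistic when the true standardized effect is effect_size.
    shift = effect_size * (n_per_group / 2.0) ** 0.5
    # Probability of landing beyond either critical value.
    return (1.0 - std_normal.cdf(z_crit - shift)) + std_normal.cdf(-z_crit - shift)

# Illustrative only: a medium effect (d = 0.5) at two sample sizes.
for n in (20, 64):
    print(f"n={n} per group: power ~ {approx_power(0.5, n):.2f}")
```

With these assumed numbers, 20 participants per group detects a medium-sized effect only about a third of the time, while it takes roughly 64 per group to reach the conventional 80%. Smaller effects need far larger samples still.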

Honestly, the fake Nobel prize for economics really diminishes the credibility of the Nobel prizes and science in general. I like economics, have studied it and follow it, but it does not deserve to be placed alongside the hard sciences where actual irrefutable progress is occurring.

Nobel prizes are awarded for Chemistry, Literature, Peace, Physics, Medicine, and Economics.

Cultural progress does not hinge on advances in hard science and the Nobel prize is wise to understand that.

Failing to recognize the impact of literature, peace and economics on our society is a failure to understand the entire purpose of the award.

I think economics itself is very important for society. I just don't think economists add much to the field.

Peter Nobel, a human rights lawyer and great-grandnephew of Alfred Nobel, explained that "Nobel despised people who cared more about profits than society's well-being", saying that "There is nothing to indicate that he would have wanted such a prize", and that the association with the Nobel prizes is "a PR coup by economists to improve their reputation".

So often I read some bullshit article with a headline touting the author as being a Nobel prize winner, and it is always the economics prize. It debases the achievements of other Nobel prize winners, both in hard science and in literature/peace.

Even Hayek himself was against the Nobel prize for economics because: "The Nobel Prize confers on an individual an authority which in economics no man ought to possess.... This does not matter in the natural sciences. Here the influence exercised by an individual is chiefly an influence on his fellow experts; and they will soon cut him down to size if he exceeds his competence. But the influence of the economist that mainly matters is an influence over laymen: politicians, journalists, civil servants and the public generally."

Alfred Nobel was an arms manufacturer. That's literally putting profits above society's well-being.

And as the story goes, after reading an obituary that argued exactly that ("reports of my death are greatly exaggerated" kind of situation), Nobel drafted a peculiar testament about how his fortune should be used after his death. [1]

>The whole of my remaining realizable estate shall be dealt with in the following way: the capital, invested in safe securities by my executors, shall constitute a fund, the interest on which shall be annually distributed in the form of prizes to those who, during the preceding year, shall have conferred the greatest benefit to mankind. The said interest shall be divided into five equal parts, which shall be apportioned as follows: one part to the person who shall have made the most important discovery or invention within the field of physics; one part to the person who shall have made the most important chemical discovery or improvement; one part to the person who shall have made the most important discovery within the domain of physiology or medicine; one part to the person who shall have produced in the field of literature the most outstanding work in an ideal direction; and one part to the person who shall have done the most or the best work for fraternity between nations, for the abolition or reduction of standing armies and for the holding and promotion of peace congresses.

(You will also notice that the Economics prize is a later addition, having nothing to do with Nobel's will. I've always wondered why the Swedes picked economics as the only field worthy of being added as a prize category despite not being mentioned in the original testament.)

[1] https://www.nobelprize.org/alfred_nobel/will/will-full.html

Which society, the customer's or their mortal enemy's?

Your defense of economics includes literature and peace. Both are original prizes (unlike economics) and weren't mentioned by GP.

I recognize the impact of the literature and peace prizes. I don't recognize any such impact from economics.

Name two economics Nobel winners whose awards you disagree with, and the reasons why they don't merit them.

