If you think psychological science is bad, imagine how bad it was in 1999 (columbia.edu)
275 points by ruaraidh on June 16, 2021 | 228 comments



When you think about it, it's insane that metascience isn't studied more. We're going to question every little detail about the universe, but we're just going to take on faith that peer-review and publication in journals is an effective method for weeding out "bad" science?


> it's insane that metascience isn't studied more.

It is, probably more than is apparent. See:

https://en.wikipedia.org/wiki/Philosophy_of_science

https://en.wikipedia.org/wiki/Sociology_of_scientific_knowle...

There is an interesting recursive problem here, though: what tools do you use to scientifically analyze the scientific process? Whatever tools you use will themselves be hobbled by the same systemic flaws you are trying to understand.

Also, as any sociologist will be happy to tell you, incentive structures and other human group behavior gets in the way. It's probably hard to get funding for a study that shows that all the other departments at your university aren't quite the flawless seekers of truth they appear to be.


Indeed, and if you think science is bad, imagine metascience.


When you think about it, it's insane that metametascience isn't studied more. We're going to question every little detail about the scientific process, but we're just going to take on faith that peer-review and publication in journals is an effective method for weeding out "bad" metascience?


Indeed, and if you think metascience is bad, imagine meta-metascience.


Meta-metascience has been studied _extensively_ by previous generations, where what this generation calls "science" was called "natural philosophy" and interesting discourse and study was had on such things as:

* The study of knowledge (epistemology): https://en.wikipedia.org/wiki/Epistemology

* The study of existence and what exists (ontology): https://en.wikipedia.org/wiki/Ontology

* The study of the purpose of things (teleology): https://en.wikipedia.org/wiki/Teleology

It turns out that these are hard areas of study, and they require a lot of properly focused leisure time to understand. Most people don't consider these things, wing it, and wind up working with a half-baked meta-metascience of their own creation ... oh, I see what you mean :-D


Where's tail call optimization when you need it?


Would be nice if tail call optimization solved the halting problem.


You've caught an Ouroboros by the tail!


I might be biased as someone who studied philosophy. But I wish Philosophy of Science were mandatory in more science degrees. A lot of scientists don't seem to be familiar with it.


Can you point to something good and useful in the Philosophy of Science? I mean, let's say physics is as good as it gets (laws in mathematical language, strong experimental evidence, solid results); from my perspective a success. But it has nothing to do with how someone like Popper imagines science. So insisting on a Popperian process would also not be beneficial for physics, in my opinion. But probably there is better stuff, I just couldn't find it. (With a Popper mindset nobody would understand why we still study 'falsified' theories like electrodynamics.)


Some key things for me would be the ideas that:

1. There is no such thing as objectivity in inquiry. There is always a scientist interpreting things, and they are always looking at things through the lens of their own biases and cultural norms. The best you can hope to do is to be aware of your limitations.

2. Statistical evidence on its own is not always a good basis for believing something to be true. Typically you also want a good theoretical model, and an understanding of the mechanism that underlies the observed phenomena (physics is good at this, psychology not so much).

3. That there is more knowledge than is detectable through statistical methods (currently at least). And that lack of evidence from statistical studies does not necessarily constitute good evidence that a theory is false.


1. This is pretty trivial. Why do you think most scientists aren’t aware of their limitations?

2. This is studied in statistics classes. It is impossible to analyze data without a statistical model, so of course your conclusions rely on it. So this is also quite trivial.

3. Again, are scientists oblivious to this simple notion? I doubt it.


Why is it that mirrors flip your image left to right, but not top to bottom?

That’s a question that Richard Feynman supposedly asked his grad students. Once you answer it, I think you’ll realize that #1 is anything but trivial.

Answer below, stop reading this comment if you want to figure it out on your own.

It’s because you’re comparing the mirror image to what you’d look like if you walked around the mirror, instead of to what you’d look like if you floated over the top. This assumption of horizontal travel is incredibly deeply engrained in humans, to the point that the English language doesn’t even have up/down equivalents to the words “left” and “right”, i.e. a word that means the direction closer to your head than to your feet regardless of your orientation.


A mirror doesn't swap left and right, it swaps front and back.

The realizations that people expect yaw rotation because that's what they are used to, or that everyday language isn't always precise (my left? your left?), seem extremely trivial to me and don't require any deep philosophical "insights".

In the hard sciences, you often don't even need the handwavey "swap a and b" explanation when it is much more useful to just model the behavior (intersect the ingress ray with the plane, normalize the ingress ray, then subtract twice its component along the plane's normal to get the egress ray direction).
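
For what it's worth, here's a minimal sketch of that model in Python (my own illustration, not something from the thread; the function name is made up). Reflecting a direction about the mirror's unit normal flips only the component along the normal, which is the "front/back swap":

    # Minimal sketch; assumes the mirror's normal is already a unit vector.
    def reflect(ray, normal):
        # Egress direction: r' = r - 2*(r . n)*n
        dot = sum(r * n for r, n in zip(ray, normal))
        return tuple(r - 2 * dot * n for r, n in zip(ray, normal))

    # A mirror in the x-y plane (normal along z) flips only the z component:
    print(reflect((1.0, 2.0, 3.0), (0.0, 0.0, 1.0)))  # -> (1.0, 2.0, -3.0)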

I'm sure Feynman was a great physicist and teacher, but he's also great at just wowing people by using lots of words without saying much. Like in his famous why-is-ice-slippery video, where he goes off on a completely unnecessary discourse on the nature of questions instead of just answering the dang question.

I always felt that is the perfect way to showcase the difference between education and edutainment, and I assume his lectures were a bit more substantial.


I think you’re certainly correct, but remember that the question wasn’t just about understanding mirrors, but using mirrors to make a point about bias. “A mirror image is swapped left to right” doesn’t appear to be biased at all, and yet it is.

Also, just for fun: what makes calling it swapped front to back more valid than left to right? One could recover the (translated, rotated) original image through any of a front to back swap, left to right swap, or top to bottom swap. They’re all lenses, and equally valid.


https://en.wikipedia.org/wiki/Anatomical_terms_of_location

I guess technically correct, because they're mostly Latin loan words.


I didn’t know that terminology, thank you.

But reading the article, are those not applicable only to body parts? Would an astronaut ever say to another astronaut, while floating in different orientations, “Hit the button to your superior”?

Either way, I think it’s at least correct to say that there are no layman words for it.


I fail to see how it in any way relates to #1.


A lot of this stuff applies more to bad scientists, of which there are many.


LudwigNagasena, to you it may be obvious (as it was to me). However, I discovered (much to my disappointment) that it is not obvious to career 'scientists'. A fool with a PhD is still a fool. What used to be the domain primarily of religions (rituals, superstitions, etc.) is now the domain of several offshoots, one of which is mainstream science.


Is this coming from judging people's actions, or discussion with acquaintances and colleagues, or discussion with trusted friends who are career scientists?

My understanding is that a lot of problems in science are incentive based, and that would look very similar to someone just not knowing things from the outside.


Mostly based on the 'science' out there. I probably know about 3 career scientists, I think, which is probably not a good sample.

>My understanding is that a lot of problems in science are incentive based, and that would look very similar to someone just not knowing things from the outside.

I can agree that a lot of it may be incentive based vs 'scientists' just not being smart enough. But it is hard to glean what the proportions are. I still suspect (based on a general assessment of people both known and on the internet) that it is the latter.


> Can you point to something good and useful in the Philosophie of Science?

Every scientist should be able to name the particular tenets that they take on faith.

Scientists do take on faith that the universe is causal and that the rules today are the same as the rules yesterday.

Yes, scientists double check these assumptions over and over, but they can never "prove" them.

This kind of introspection is important for science to set itself apart from religion, for example.


I meant more like book, or a philosopher.


Yeah, I think the fundamental distinction is that metascience is probably at its core going to be philosophy, but that doesn't have to make it any less rigorous. And I agree, Philosophy of Science being included in more science degrees would be great. One of the most interesting courses I took in my undergrad.


It's not that useful in practice though, for many fields a bit more ethics is a much more pressing issue.


>what tools do you use to scientifically analyze the scientific process?

Iterating on existing systems to see if you can get results to converge and also testing new systems to see if they also result in known good values.


It's fun to realize how deeply ingrained the current scientific process is in our way of thinking.

All of these ideas that you tacitly take for granted are themselves mutable parts of the scientific process:

* That iterative improvement and hill-climbing is an effective process for improving results.

* That replication of experiments and convergence is a truth-generating enterprise.

* That truth can be expressed numerically.

* That there are some values that are "known good". By what process? According to whom?

To be clear, I don't disagree with those. However, these rules aren't baked into the firmament of the universe. They are processes we humans have chosen to apply in our social process of reaching consensus on truth. In other words, this list here isn't physics, it's technology.

It's entirely possible to imagine a culture whose truth finding bodies don't take for granted one or more of these rules at all. That culture might be more or less effective (again, according to what metrics?), but it would still be well-defined.


> That iterative improvement and hill-climbing is an effective process for improving results.

Not essential to science; in fact, there's a major viewpoint within metascience that explicitly rejects this as popular mythology of how science works in practice, holding that models change by revolution more than evolution.

> That replication of experiments and convergence is a truth-generating enterprise.

Not part of science, in the same way that your use of “truth” later is not.

> That truth can be expressed numerically

No. While scientism may make essential connections between science and truth, science itself only depends on useful predictive models being expressible, not on truth being expressible, numerically or otherwise, or even being a coherent, meaningful concept.

> That there are some values that are "known good"

That's not only not essential to the scientific process, but contrary to something that is: that all results are contingent.


Yeah, I'm reminded of Zen and the Art of Motorcycle Maintenance, where the author was driven to insanity by the quest to define "quality".


> That culture might be more or less effective (again, according to what metrics?)

Isn't this the idea of "free will"? That you get to choose for yourself the metrics you want to optimize your life for?

Now I think that you'll use a combination of learned and inherited desires for it. But the idea here is that each individual can express those desires, and then the "success" of a society is thus to maximize each individual's success, even when they differ in their metrics.

That's a made-up concept as well, but I think it still stems from individual desires. We've just mostly all individually observed that an organized society that compromises to maximize each and everyone's individual desires carries less risk of our own desires being squandered.

The alternative would be to try to achieve power over others to maximize your desires, and maybe from history and life experience, people have found that to be not sustainable or only achievable for a few, thus your chances at it are lower.

In essence, I think I'm saying that it seems over time people know their desires, but don't know how best to fulfill them, and this is the metric.


> That truth can be expressed numerically

I don't think this is what is held by those-who-do-science-and-philosophy-at-large (though it may be a generally accepted hand-wave, I don't know). See, for example Category Theory for a branch of what-I-believe-would-generally-be-called-science that doesn't use numbers, but instead expresses things with sets and relations.

The logician is the intersection of the set of all scientists and the set of all philosophers.


I think I disagree, since mathematics cannot be reduced to logic (or any other instrumentalisation).

https://plato.stanford.edu/entries/philosophy-mathematics/#M...


That article says that mathematics is ill-founded, not that math has something inexpressible by logic. Mathematics can be reduced to logic that is ill-founded.


The only nit I'd pick with the list is:

* That truth can be expressed numerically.

Isn't the basic point of quantum physics that this isn't true? We can only make guesses with probabilities, but we can't know the actual truth, and therefore can't express it numerically.


Sure, but the probabilities are numbers too. Which is again sort of acknowledging the need to fit quantum mechanics into a numeric framework.

Imagine you were studying ice cream flavors. You might design a study like, "We'll ask a lot of people and the flavor that the most people prefer is the best." In other words, the metaprocess you use to design your experiment itself tacitly assumes you need a numeric result. The presumption of comparison and quantifying frames the questions you even think to ask.

But you can imagine an alternate culture that when studying ice cream flavors doesn't even ask questions with numeric answers. It could be, "We'll ask a lot of people to try flavors and write poems about the experience."

We wouldn't even call this "science". Because there is a hidden border around even the term that affects how we are able to evolve the scientific process.


Funny thing about quantum mechanics... we use numbers to describe probabilities and functions to describe probability distributions. But the mathematical models we use are incomplete (and, looking at neutron decay, incompatible) so can we really conclude that numbers are the right abstraction?

And no, not poems. Elements of finite groups are not numbers, but they crop up in physics frequently. Topologies are not numbers, but they're also significant in physics.


That seems broad enough to describe human history, doesn't it? So, "keep doing what you do"?


Not necessarily. You could just fall back on authority, superstition, or navel-gazing.


That would probably fall under "testing [new/other] systems to see if they also result in known good values."


> what tools do you use to scientifically analyze the scientific process?

Engineering. If you can build something that works based on the rules theorized by scientists, they are on to something. e.g. building a skyscraper proves we know the properties of steel to a pretty good margin of error.


For many sciences, like psychology, that's not normally an option.


Psychology is basically applied neuroscience. They are both in their infancy and we are using the best information we can, but I wouldn't call the body of work comparable to other sciences.


Which raises the question why we study this subject at all, if it's not actually applicable to anything beyond making viral TED talks.


Yeah so it is not science but rather philosophy and it is called Epistemology and not metascience. Even though I suppose metascience would be an apt description in a way.


"People" sciences like sociology, psychology and economics can make incredibly misleading claims because one experiment over a small sample of people at a certain moment in time might seem to support a claim, while the actual reason for the observed results is a factor which is never taken in consideration. On the other hand, conducting those experiments over wider demographics and in different points in time means that the study wants to build "universal" models of how each single person in the whole world acts, which is utterly dismissive of the specific local environment around people.

Sociology in particular should always be approached highly critically, because applying those theories and reasoning in its terms often means mass control over people's free will.


I majored in psychology in undergrad. A big part of why I didn't look for a psychology-focused job is that the science is all so loose. I'd often learn about two different study-backed phenomena in two different classes that somewhat contradicted each other. Or I'd learn in a subsequent class that a previously taught study has been invalidated in one way or another. Almost everything is measured subjectively, so huge parts of our knowledge of psychology are a house of cards resting on assumptions that the diagnostic questionnaires used to measure are accurate and reliable. Many of the measured effects are small, and so it's hard to trust that randomization and controls are sufficient. Replication of results is a major issue.

It all just feels so 'loose' compared to the physical sciences.


The important problems are hard. Avoiding psychology because it's messy is like the metaphor of only searching for your lost keys under a lamppost.


Think about how loose medical science used to be (and for how long): leeches, bloodletting, miasma, ridiculous enemas, and all sorts of outright nonsense. We've got a lot more mistakes to make, but the social sciences will improve too.


To be fair, no one was using statistics and the scientific method to support bloodletting, or miasma theory.


No, but let's not forget that much of statistics was originally invented to provide a rigorous underpinning for eugenics. Pearson, who invented many of the most commonly used statistical results, was a prominent eugenicist and contributed greatly to its ideas.


That doesn't mean that statistics aren't a useful tool for checking the strength of evidence, or building a case for a hypothesis.


Sure, but it doesn't change the fact that your findings will be as useful or useless as your premise. If you're trying to use statistics to prove pseudoscience, well, it's still pseudoscience at the end of the day. (NB: I don't mean that psychological sciences are pseudoscience.)


Honestly all sociology I have seen or been exposed to, including in college, seems to be more interested in acting as a platform to push specific ideas, rather than an attempt to find truth.

Beyond that those involved in sociology seem to believe that a study is the same thing as an experiment and like to believe that constitutes proof.

Ultimately we can't really run A/B experiments on society at large because we are living in it; however, humanity has at its disposal all of history as a case study. My point is, if you really want to understand how societies interact and form, and react, and live, ask a historian, not a sociologist.

I also would apply most of these comments to economics except there seems to be more diversity of viewpoints, and studies are used less than math to try and provide a veneer of respectability.

EDIT:

If someone feels that history is inferior to sociology for understanding how societies act and behave, please tell me why. I want to understand where I am wrong. But I see a lot of the arguments we are having in society nowadays as the same as ones had a thousand years ago; the discussions over social media are basically the exact same ones people had over the printing press in Europe. I recently read "The Republic" and there were the exact same arguments I see repeated here.

So if you feel contrary please tell me why, I admit I could be wrong, but want to understand where my reasoning is flawed.


I think you are naive about history.

I'm an economist. If I threw away the half of the data that didn't support my findings, and got caught, I'd lose my job and never publish again. I'm pretty sure the same is true in other social sciences, such as psychology. This is true irrespective of the well-documented problems that the article describes, which certainly also apply in economics and elsewhere, to varying degrees.

By contrast, when historians are caught cutting sentences in half to prove their point, they don't lose their jobs. They don't even lose their Pulitzers: https://davidhughjones.blogspot.com/2020/07/can-we-trust-his...


But let's not pretend that historians influence the policy makers as much as sociologists and economists do either.

The damages when they are wrong are orders of magnitude bigger.

They are assumed to be right, sometimes even without proof, until they are tragically proven wrong.

And nobody loses their job anyway.

Have you ever seen a sociologist lose their job because they proposed something to a politician that resulted in lots of people having their lives ruined?

I never did, honestly.

Have the three most recent economic and social crises been caused by historians' mistakes?

https://familyinequality.wordpress.com/2017/10/02/sociologys...


I think you underestimate how influential historians are in the long run, by changing how we see ourselves. But in any case, my point was about which disciplines we can trust, not about which are more or less powerful.


Economic policy is implemented by the parliament and the executive branch. It rarely closely follows advice by the economists. Even the Fed chair is a lawyer!


The thing about textual evidence is that you can't cite an entire text (obviously). You have to selectively choose what to quote in order to support your claims. Additionally, people can write one thing, and then write other contradictory things. Or they can act in ways that contradict what they write. It is from this totality of evidence that non-quantitative methods draw their conclusions. To get to the point, I'm not necessarily claiming that Nancy Maclean (the historian "caught cutting sentences in half") is in the right here, but if you actually follow the debate it seems quite nuanced and the internet critic hadn't actually even read most of the book they were criticizing (and also clearly has certain political leanings to boot). Certainly nothing like "throwing away half the data that didn't support my findings."


The JEL review which I quote in the linked blog certainly had read the book, and called it "replete with significantly flawed arguments, misplaced citations, and dubious conjectures". And if cutting sentences in half, to remove something which directly contradicts your thesis, doesn't count as historical malpractice, then what would?


It's just not that simple. I'm not making a value judgment on the book (I haven't read it), but a person can say two things in the same sentence and the broader context can make it clear that they're just covering their ass, for example. Perhaps that's not what's going on here. Perhaps the book does constitute "malpractice." But...I think the situation is more complex than you're giving it credit for, and I wouldn't be comfortable drawing conclusions without a greater familiarity with the book and the responses to it. I also don't give a lot of credence to the blog you linked, since they use (as one of their two pieces of evidence) a critique which openly admits it hasn't actually read the thing that is being critiqued.

To your last point, plagiarism, for example, definitely counts as malpractice and humanities professors lose their jobs for it.


After following up on your sources, I surrender my position. It appears that the quest for truth has largely been abandoned in academia, and that integrity is a fool's dream.

We truly are, as T.S. Eliot said, the hollow men.


Let's not go overboard now....


I don't disagree with you, but frankly it would be a bit frustrating to limit one's studies of human behavior to just history without trying to understand the dynamics of current societies, how they respond to change, and so on. Both fields have a completely different set of instruments and very limited overlap.


The best social science studies involve often accidental experiments, where good experimental conditions occur not because of design but because of happenstance. The analysis of these situations could be construed as a historical case study, or it could be construed as an experiment. I agree that seeing analogues in past societies is not the best approach, but studying history can sometimes reveal experiment-like conditions.


Another similar issue is with data from situations that would be clearly unethical to intentionally create: behaviour of plane crash survivors stranded on mountainsides, castaways, feral children, etc.


I felt this way as well. But you might benefit from reading more old school sociology books.

C. Wright Mills's The Sociological Imagination is great (it should have been taught to you in college). Thorstein Veblen's Theory of the Leisure Class is good as well. These really seemed to me like attempts to approach truth, and perhaps that's because of the time they were written in vs the time we live in now.


"On the other hand, conducting those experiments over wider demographics and in different points in time means that the study wants to build "universal" models of how each single person in the whole world acts, which is utterly dismissive of the specific local environment around people."

I don't think building "universal models" or observing recurring patterns through analysis of 'experiments over wider demographics and in different points in time' require the ambition to predict a single individual behavior or actions as a corollary.

The problem lies - like you said - with the policymaker. And well more generally with people who extrapolate the results of a paper inadequately.


The problem is even more pervasive than that. There is an irresistible tendency to try to make universal statements rather than just sharing anecdotes and not generalizing from them.

Like, for example, I just made two universal statements, didn’t I?


Yeah, so you have to draw a distinction between empirical science and science here, really. Max Weber, one of the pillars of the social sciences, made this point around one hundred years ago.

"As such, he was a key proponent of methodological anti-positivism, arguing for the study of social action through interpretive (rather than empiricist) methods, based on understanding the purpose and meanings that individuals attach to their own actions."

https://en.wikipedia.org/wiki/Max_Weber

edit: This is then further developed by the so called Frankfurt School as Critical Theory.


Alternatively, anti-positivist endeavors should find themselves another space to occupy and not piggy-back on an institutional adjacency to actual sciences to posture credibility, authority, attain public funding etc.


On the other hand one might also argue that positivism is just a school of philosophy and that positivists are piggy backing on a thousands of years old tradition of philosophy.


>> thousands of years old tradition of philosophy

imagine those proponents of 'thousands of years old tradition of philosophy' would try to accomplish anything with it. how funny that would be..


I saw a chart once that made the point that most research studies can be classified along two axes: rigorousness of methods and popularity of results. Most published papers have either high rigor/low popularity or low rigor/high popularity. The trade-off lies in the fact that highly rigorous studies only allow for narrow, unexciting results, while popular studies with flashy results have to compromise on their rigorousness. This is not really a rule, but it is an interesting way to see research and the editorial/peer-review process.

The example discussed in OP seems to fall in the category of low rigor/high popularity. I am not 100% on my history of psych research, but it seems to me that the stereotype threat was all the rage in the late 90s following the publication of Steele and Aronson (1995). OP study seems to follow a similar experimental setup as S&A with a new group of people (Asian-American women).

As far as meta-science is concerned, I think that it remains mostly a part of philosophy (as in epistemology) and the focus of a few (senior?) scholars in each field. There is really no space to publish meta-scientific papers that "shake up" the field and call out established researchers, as editors that publish those pieces could come under similar criticism for their work. I think that it is not an accident that the discussion of the replication crisis in psychology started from blog posts and other non-academic avenues and then found its way to more "established" publications in the field (again, if I remember the context of those conversations).

I really wish that the review process were open. It would be interesting to see the reviewers' comments on this specific paper and how the editor decided to pick up and engage with them. All those conversations are usually locked up in some editorial management system and are seldom made public. I don't know if we can really have open science without having open peer review.


I agree with you mostly, but it's worth noting that modern meta-science had its blossoming in psychology in the 1960s, with the development of meta-analysis (with educational psychology and clinical psychology). Technically the origins are much earlier, in the 30s(?) in statistics, but as a field I think it took off around that time, and spread.

Similarly, the replication crisis was being discussed in a lot of areas, especially in psychology, throughout this time, but was largely ignored until after the Bem ESP study. Registered replications aren't new, nor is concern about meta-science; it's just had renewed focus in recent years for various reasons.

It's not all that surprising to me that meta-science is associated with psychology. After all, not only is psychology often sort of fuzzy (by necessity of its subject matter), but it's the science of human behavior, which I think can lay claim to scientist behavior as well.

I think it's arguably the greatest contribution of psychology to the sciences in general.


>When you think about it, it's insane that metascience isn't studied more. We're going to question every little detail about the universe, but we're just going to take on faith that peer-review and publication in journals is an effective method for weeding out "bad" science?

In a sense we do have this: engineering and finance. Engineering turns good hard science into new tools, machines and weapons, and Finance turns good (predictive) soft science into new ways to make money.


> In a sense we do have this: engineering and finance. Engineering turns good hard science into new tools, machines and weapons, and Finance turns good (predictive) soft science into new ways to make money.

I think this is a common critique, but I also think it is missing the point. What if the question of interest isn't so easily verifiable like in Engineering? Do we just throw up our hands and give up on those questions? The alternative to good social science is not no social science, it's bad social science: https://statmodeling.stat.columbia.edu/2021/03/12/the-social...

Finance is also a bit tautological in this regard. It seems that often prediction models are impossible to disprove (e.g., our arbitrage method doesn't work anymore, the market updated). Yes good for putting skin in the game, but doesn't seem like it does much to advance our long-term understanding of humans.


>What if the question of interest isn't so easily verifiable like in Engineering? Do we just throw up our hands and give up on those questions? The alternative to good social science is not no social science, it's bad social science: https://statmodeling.stat.columbia.edu/2021/03/12/the-social...

Some things may well be complex enough that it's simply impossible, with the amount of resources available to the average university, to conduct a thorough enough study on a representative enough sample that accounts for enough confounding factors to make a statistically sound prediction that generalises. If this were the case for a significant proportion of the subjects of study of a particular field, then it might well be better to "give up" and admit we don't and cannot know, otherwise we're essentially creating a factory for bad science (as the available resources relative to the scope of the problem aren't sufficient to create good science, and there's no negative feedback to stop the bad science).


If there's "good social science" that can't be used to make predictions, what differentiates it from bad social science?


Its utility [1]. The social sciences study a lot of things that people in group A intuitively understand that group B can be completely ignorant of - say, for example, how to navigate a complex social structure like office politics in a modern workplace. Making any sort of predictions about intangible outcomes where the Hawthorne effect is in full effect is pretty much impossible, since group A will respond to the new knowledge gained by group B, in effect changing the system we're trying to predict. Individuals' psychologies respond to the changing psychology of the group in nondeterministic ways (at least, relative to our ability to collect data on input variables and internal state).

We can bikeshed what makes something a "science" till the cows come home, but the philosophy of science and epistemology were not settled with Bacon and Popper - the end goal has always been understanding in the broadest sense. Those studies have value as long as they help someone make sense of and adapt to the social systems they're in. It does mean, though, that those studies should be approached with extreme caution (see the decades wasted on string theory), and anyone basing their research on past results needs to carefully validate their assumptions.

[1] I think in this case "predictive" as a scientific term of art is too restricting. Social sciences often deal with very personal interactions that appear nondeterministic at the scale of a society but are relatively predictable when applied to a stereotypical office or school setting.


I don't understand what difference you're trying to draw between utility and predictive power. If you can give information on what approach will generally work better for office politics, that is just a prediction. It doesn't mean that these predictions have to be always right, but if they don't have predictive power and are no better than a coin flip, that "understanding" is just a post-rationalization that doesn't provide any utility at all.

At the very least, it seems to me like the person I originally responded to would also disagree with judging social sciences for its "utility" - the article they linked specifically contrasted it with the natural sciences that "solve problems".


For something to be predictive in a scientific sense, it has to be repeatable. There is so much variety in individuals and their environments that most social sciences have little repeatability - they mostly study affluent western college students who have time to volunteer for college psychology studies. However, if you're mostly an affluent western college student, chances are that you can take some value out of the studies because they're selected for your environment rather than humanity as a whole (which is what they purport to do by claiming to study 'psychology' rather than western college students specifically).

Closest analogy off the top of my head is psychiatric drugs: their efficacy is generally bottom of the barrel except for some group with factor X (each drug has its own unique factor X). For the vast majority of these drugs, we have no method of screening for whether a person has factor X - we don't even know what it is most of the time - so doctors have to go through a process of trial and error with patients until they find the right drug or combination. Once they do, it's like a night and day difference for the patient, yet if we applied the same standard of evidence for psychiatric drugs that we do for blood pressure pills, we'd never make any progress. A lot of the drugs look like they don't work in phase III, and we have no way to predict which drug will help which patient, but the patients figure it out with their doctors because they have actionable data, even if it isn't predictive in general.


I think this is an important point; ultimately good science will produce verifiable, testable, actionable results. Until you have that, no matter how much math you use, how many lab coats you've got, no matter how many journals you publish in, you're just sitting there playing with strings.


Telling when this has happened can be less trivial than you would assume though. People (scientists even) were sure that phrenology produced verifiable, testable, actionable results for a generation or two.

In the long run it usually comes out, but the run can be longer than you think, and you may not be where you think in it with regard to any particular current theory. I wonder what things we all "know" are proven by science will be dismissed by later generations. (I personally guess a lot of genetics-related stuff will be.)

(Note that something doesn't need to be verifiable, reliable, or true to be "actionable". You can act on anything...)


A good start would be something like Probability Theory: the Logic of Science By E. T. Jaynes: http://www.med.mcgill.ca/epidemiology/hanley/bios601/Gaussia...

Also, the difference between a bad method and a good method is that the good method makes more accurate, better calibrated predictions (that is, using it makes us better gamblers).
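
As a toy illustration of what "better calibrated" means (my own sketch, not anything from Jaynes; the numbers are made up), a simple scoring rule like the Brier score rewards forecasts whose stated probabilities track what actually happens:

    # Toy sketch: score two forecasters on the same binary outcomes.
    # Lower Brier score = better calibrated predictions.
    def brier_score(probs, outcomes):
        return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(outcomes)

    outcomes      = [1, 0, 1, 1, 0]            # what actually happened
    calibrated    = [0.9, 0.1, 0.8, 0.7, 0.2]  # hedged, mostly right
    overconfident = [1.0, 0.0, 1.0, 0.0, 1.0]  # bold, often wrong

    print(brier_score(calibrated, outcomes))     # ~0.04
    print(brier_score(overconfident, outcomes))  # 0.40

In betting terms, the second forecaster would lose money to the first over time.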


It is, but you aren't aware because meta-metascience isn't covered enough.


There was a past discussion about peer review on HN: https://news.ycombinator.com/item?id=20607259

Another related discussion was about the grievance studies scandal, which also touches on peer review and academic rigor in journals: https://news.ycombinator.com/item?id=18127811


It isn't so much that it is not applied as that the pressure to generate data inside academic science makes it extremely difficult for grad students and postdocs to allocate the time to doing science on their experimental processes. They already only barely have time to do experiments on the system they are actually trying to study!

Engineering organizations inside major corporations usually actively engage in process improvement because they are resourced to do so.


If you are interested in this I really recommend you look at writings from the fields that have descended from what was called "Laboratory Studies" and now variously calls itself "Science studies", or "science and technology studies" or "science, technology and society" (STS is a common acronym). For the stuff that's very lab-focused I'd say you could start with Bruno Latour and Steven Shapin, both classics of the older guard of the field.


When someone hits you with a truck-load of mathematical jargon and convoluted experimental setups, most people just end up fatigued and find themselves nodding to conclusions out of fear of looking stupid. And the problem is only amplified by the recent "Science Rocks!" attitude making the rounds in pop culture.


I was recently imagining an experiment classification tagging system: "trial", "reproduced", "peer reviewed", etc. I could imagine this set of information landing on some wikipedia page and the experiment in question would gain a bunch of these tag badges as understanding of the phenomenon matures.
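
A rough sketch of what one record in such a registry might look like (the tag names and fields here are purely my own invention, nothing standardized):

    # Hypothetical sketch of per-experiment status tags for a wiki-style registry.
    from dataclasses import dataclass, field

    @dataclass
    class ExperimentRecord:
        title: str
        tags: set = field(default_factory=set)  # e.g. {"trial", "reproduced", "peer-reviewed"}

        def add_tag(self, tag: str) -> None:
            self.tags.add(tag)

    record = ExperimentRecord(title="Some priming effect")
    record.add_tag("peer-reviewed")
    record.add_tag("failed-replication")
    print(record.tags)

The interesting design question is who gets to award the badges, and on what evidence.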


Epistemology is an interesting field, but it's just not part of the curriculum for engineering and experimental sciences types.

I only had a good introduction to it when I took it as an optional course in a humanities college.


*just not part


Indeed !


The incentive system that academic scientists live under explains why they don't push for studying metascience.

If that's going to happen, it has to come from outside the government-science complex.

Good luck with that...


Dedicated folks that just try and reproduce important papers would be amazing and valuable. Only time I have seen it happen at scale was during the Cold Fusion days.


That's because if cold fusion had been proven to work, there were fortunes to be made.

Most science has very little market value.


Current "science" is often not interested in replication, or even in testing whether experiments can be replicated. The culture is to be cited, and "science but boring" is not cited. On top of that there is publish or perish.

Also, I don't want to point fingers, but some scientists come from places where cheating is the norm.


There is a ton of current and past scholarship on the sociology, history, and philosophy of science, and peer-review and publication is actually a pretty hot topic in those fields. Although yes, perhaps it would be nice if that work was better funded, or if practicing scientists paid more attention to it rather than just repeating old myths about how science works.

For example, here’s a scholarly article on the exact question you mention - how and where peer review came to be seen as a guarantor of scientific quality: https://www.journals.uchicago.edu/doi/abs/10.1086/700070 (tldr: it wasn’t the 17th century Royal Society; it’s much more recent.)


Another classic on this subject is Shapin's "Pump and Circumstance" - which is also freely available:

https://scholar.harvard.edu/files/shapin/files/shapin-pump_c...


Similarly, I heard someone the other day assert that no one had done a double-blind, placebo-controlled trial of the effects of FDA regulation.


That is probably true, but do you have any proposal how such a trial should be done?

RCTs are good when they can be done and I'm all for doing more of them and too often there's no good excuse for not doing them. But at some level things just get impractical.


I worked at a well-known clinical psychology lab at an Ivy League school many years ago. There was so much "gaming" of the results.

For example, our PI didn't want to include a subject in our study because his scores weren't elevated enough, and our PI was worried that his score wouldn't drop enough which would adversely impact our results.

Another example: our active treatment therapists knew exactly what they were treating for, and our study was measuring improvements in the condition that was being treated. However, the control therapists had no idea what they were treating for, and we purposefully kept this information from them!


Hence the “replication crisis”. Replication of experiment and differences in outcome highlight the “core” result. When the results are just totally different, obviously there was “gaming” or just a bad experiment.


The first example doesn't seem that bad. An intervention cannot reduce something if it isn't there to begin with.

The screening should be disclosed in the methods (and, ideally, pre-specified), but you do need to account for floor/ceiling effects somehow.


I still don't understand why we "believed" studies that claimed knowledge about typical behavior and subconscious decisions while those studies were only based on a single experiment with limited participants, e.g. the Stanford Prison Experiment. These types of studies should ALWAYS be looked at critically, and not just because they fail to reproduce, but because they are based on a very small sample size in a very discrete scenario (and probably with participants who are not diverse).


The problem is not just with "belief", but in the process itself. Non-replicable studies are actually cited more than replicable studies:

https://advances.sciencemag.org/content/7/21/eabd1705

Science journalists are probably vulnerable to the same influences that lead scientists to do this, except they have even less review of their claims, and so these become pop culture sound bites.

As for why non-replicable results are cited more, I'd speculate that non-replicable results are often more unintuitive and surprising, and per the above link, reviewers apply lower standards on these papers in the hopes of finding something truly interesting and/or exciting. Not just in the results mind you, sometimes papers also apply a novel methodology that might be worth wider discussion. I'm not sure that's worth the reduction in credibility though.


> As for why non-replicable results are cited more, I'd speculate that non-replicable results are often more unintuitive and surprising

This seems the likely explanation; I saw a paper recently that showed that lay people can predict what will replicate with above-chance accuracy[1]. I imagine scientists are even better than lay people at this.

So non-replicable results are almost by definition surprising (i.e. they are hypotheses that don't match our current model of how the world works), and surprising results are definitely better news than unsurprising results.

[1](https://journals.sagepub.com/doi/full/10.1177/25152459209196...)


> As for why non-replicable results are cited more, I'd speculate that non-replicable results are often more unintuitive and surprising

I mean, a kind of well known thing is that in general, something false can be more interesting than something true, since it has more degrees of freedom. You can make up anything you want - the truth has to conform to what's actually real.


Same reason we believed eggs will give you heart disease. Science reporting is terrible and the general education system doesn't teach rational skepticism, it teaches unconditional trust of intellectual authority.


every mainstream thought about diet is untrustworthy to me due to the above type of example :/


Yes. The simple fact is that nutrition epidemiology has failed. We don't know much of anything, other than a handful of basics like avoiding loads of refined sugar and trans fats.


That's not completely true, the information is just harder to find and more complex to make popular.

For example, bad fats still play an important role in clogged arteries, but it's more nuanced than that. There are many kinds of fat, and there are other variables that cause fats to clog arteries, such as sugar.

Well, so yes, it does seem that avoiding refined sugar and trans fat and not overdosing on calories, and keeping highly active in terms of exercise, and not being in the same position for too long, and avoiding foods that inflame you (which seems to be very personal), and making sure you get a varied diet of nutrients, is all we know.

I just don't know if that should be framed as a failure. It could just be that it's a hard problem, or that there are no real patterns to learn. The latter is interesting, because we start our nutrition quest believing nutrition affects health a great deal, but that could just as well be false; nutrition could be a very small factor in health.

I think the problem is just the tendency we all have for snake oil, shortcuts and easy way outs. That is where I think this impression of "failing" comes from. That we didn't find something easy to do and that works very well. In that sense, science is characterized by lots of failures.


it is. mainstream diets will tell you:

* saturated fats are bad (lies told to us by a crappy Ancel Keys study, promoted for decades by processed food companies (like Kellogg's) run by Seventh Day Adventists who were convinced "meat led man to dangerous impulses and temptation")

* polyunsaturated fats are good. The American Heart Association had an article up for years that went as far as to claim omega-6s are heart healthy. They only took it down this year. But we know they're inflammatory, and we know we're consuming 25-100x more omega-6s than we ever would before the industrial invention of seed oils being shoved into every product imaginable (bread, cereal, granola, anything that comes in a box, feed given to animals meant for meat production). Here's a webmd article on it: https://www.webmd.com/heart/news/20090126/expert-panel-omega...

* sunlight creates high cancer risk (ignoring that cancer is unlikely, treatment if caught early has a high survival rate, and not having vitamin D throughout your life risks far more likely autoimmune issues, depression, anxiety, and even certain cancers and inflammatory disease).

* sugar is good for you. Sure, they'll specify processed sugars are bad for you or "added sugar", but common wisdom will accept a NET (subtract fiber) 200-300g carb diet as acceptable. Grain is still often listed as the most important and largest part of the food pyramid.

The reality is - all mainstream health advice, including that which you'll get from your doctor who got a whole single nutrition class in school, ensures that processed foods don't lose business on the front end and the medical/pharma industries don't lose money on the back end.

even in the push for a more vegetarian/blue-zone diet world - they're doing so by promoting meat alternatives like "Beyond Meat" which is chock full of so much seed oil and other processed substances, it's mainstreaming vegetarianism-as-fast-food. McDonald's burger.. is still a McDonald's burger and you shouldn't be eating it.

If a factory isn't making it at scale, shoving it in a box, branding it, and ensuring you don't have to spend any time making/cooking/preparing whole, fresh foods (those pesky things that tend to have short shelf lives and are costly to Ag businesses), then your PCP, the government, most food businesses, your medical insurance company - no one of any kind of "authority" - is going to promote it highly.

They'll do ANYTHING except remove seed oils. They'll make your potato chips out of broccoli and carrots and still drench them in sunflower or canola oil. They'll reduce the salt. they'll make shit out of beets. And still manage to make it horrible for you.

The MSM regurgitates "health" info regarding diets in a way that acts as advertising for these orgs.


what's an acceptable oil? olive? peanut? I was told to use olive oil (not even EVOO except for taste related) so that's what I get but it's impossible to know anything.


Anything that's mono or saturated fat. Mono is probably healthier. But coconut oil, beef tallow, duck fat, and butter are all healthier than canola, sunflower, or soybean oil, or Crisco.


Canola oil is pretty high in mono fatty acids actually. Not as high as some others but pretty good for a commonly available one.


It is, but it's still too high in polyunsaturated fats, most of which are omega6s.

Imho, it's vital to reduce your omega6s as much as feasible. Given our lifestyles and diets, even most "good diets" are still too high in omega6s.


Indeed! And in reality eggs are basically a superfood.


Tangent, but the Stanford Prison Experiment is an awful example of science. It was a researcher who wanted to prove a point and created the conditions to collect the data to prove that point. I hate that it's often the only psychological experiment many people are familiar with.


What about the Milgram experiment? This is also a very well known psychological experiment. Was the science behind the Milgram experiment rigorous?


Milgram's obedience studies didn't involve randomized treatments, and he had small numbers of subjects (typically around 40) in each of his many conditions. On the other hand, Milgram's investigations were serious, systematic, and in good faith -- which makes them worlds better than the Stanford Prison Experiment.

Another reply in this thread suggests that "a large number of participants may have been aware that the actor wasn't really suffering when they administered the punishment." I've studied the topic and found no evidence of this point. In addition, the claim is hard to square with many subjects' reactions -- for example, their nervous laughter and their frequent protests, even as they continued to deliver what they thought were harmful electric shocks.

See https://news.ycombinator.com/item?id=25928569 for more on efforts to replicate Milgram's results.

Tl;dr: there is no equivalence between the Stanford Prison Experiment and Milgram's work on obedience. Milgram's work was superior.


From what I have read (I'm not a psychologist, so take everything I say with a grain of salt), the Milgram experiment had some major ethical issues and doesn't meet modern standards for statistical evidence. There's also accusations that the data may have been manipulated and the results don't directly support the claim that Milgram was making. For example, a large number of participants may have been aware that the actor wasn't really suffering when they administered the punishment. This would seriously skew the results.


Calling it an awful example of science seems a bit extreme.

There is definitely something to learn from such an experiment, albeit not what was intended.


>the Stanford Prison Experiment is an awful example of science.

This may be true, but I don't think the evidence you give supports your assertion.

>It was a researcher who wanted to prove a point and created the conditions to collect the data to prove that point

"prove a point" is the hypothesis

"created the conditions" is the experiment

"collect the data to prove that point" is the observation


No, an experiment should not be set up to prove a hypothesis, it should be set up to test a hypothesis. In one the hypothesis is falsifiable, and in the other it's not.


If people didn't act the way the experimenter expected wouldn't that disprove their hypothesis in this case?


No, because people did behave differently than he expected, and he coached them to behave in line with his expectations. He wasn't a neutral observer, he was a guiding force and the superintendent of the pretend prison. This is what I mean when I say it wasn't an experiment.


Appeal to authority is a big reason. The academic leaders in the field were incentivized to find breakthroughs, or to treat any experiment they ran as a breakthrough, because that’s where their influence and reputation came from. People are so used to trusting experts that they didn’t realize that finding the truth wasn’t the primary motivator for some people.


Who is “we”?

This has been criticized since 1999 and far longer back. How long ago is the Rosenhan Experiment again?

Dare I say that a majority have always held a dismissive, critical view of such matters. But of course, those who hold dismissive views of it are not the ones who work in such fields, and certainly not at the top, ready to implement changes, so it can continue to persist and go on despite being highly criticized.

At least when I studied physics at a university around what must have been 2005, most of the students and professors there were highly critical of softer science and it often came up that some of these papers popped up and were viciously criticized for clear and obvious systematic errors in the methodology.


I assume you are referring to "The Stanford Prison Anecdote"?


same reason we believe observational studies in nutrition, or worse - studies on animals.

if you have the weight of peer review or at least a well-documented study, then the media runs wild with its claims, it gets shoved into textbooks, then governments shape policy on those claims, corporations and medical practices sell gimmicks, books, supplements, therapy, and plans of action to heal you... it all becomes lies, half-truths, bad data all just repeating itself ad nauseam until "truth" is established in the public consciousness. Quacks on the web, the American Heart Association, your local doctor's office will all peddle garbage based on the bad data. And once it's well established as true, backing away from it is hard because it's become so woven in, institutionally.

This is why people think saturated fat and sunlight are bad or at least a net-negative.

Even modern medicine, psychology and nutrition science all have horrible replication crises, and we're no better at rejecting the nonsense now than we were then.


SPE was needed at the time because everyone knew the Germans were like us, but needed a framework to say it.

It also opened up the idea the Japanese were like us too.

It allowed us to say what we believed and build on that.

It's not science. But science isn't the only way forward.


Can't we just look at history and conclude that we are "just like the Germans"? Why do we need to shroud it in pseudo-science?


They are believed because they are sensational, they are titillating, and it gives people the excuse for their bad behavior by believing that everyone would be a monster in the right circumstances.


And that's without knowing how the study was really executed. Where I worked, there was a relatively successful postdoc who presented the results of his p<.05 significant pilot study. When asked, he said it wasn't the first pilot. It was the 20th.
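
A quick back-of-the-envelope sketch of my own (assuming the pilots were independent and each tested at alpha = .05, which the anecdote doesn't spell out): the chance that at least one of 20 null pilots comes out "significant" by luck alone is already about 64%.

    # Rough illustration, not from the parent anecdote: probability that at
    # least one of 20 independent null pilot studies crosses p < .05 by chance.
    alpha, n_pilots = 0.05, 20
    p_any = 1 - (1 - alpha) ** n_pilots
    print(f"P(at least one 'significant' pilot) = {p_any:.2f}")  # ~0.64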


In the department I worked for as a faculty member, there was an issue where the other areas would have graduation rates around 40-50%, as opposed to our area which was above 95% or so. This came up in the context of an external review.

When people asked around, informally what was said was that the grad students in the other areas (especially one area, in the experimental molecular biosciences) would leave after having to "redo" their dissertation over and over again. Essentially what would happen is they would propose a dissertation study, it would be approved by the area committee, the student would do the study, and it would produce null results. So they would be told to redo it a different way, or to pick a different topic, it would get approved, and the same process would happen again. After this happened a few times, with the student being told they had to produce significant results, the student would grow despondent and leave the program.

What's sad about this is that it's formally reinforcing p-hacking basically, as part of the degree program. But it's even more absurd than what's often alluded to in meta-science writings, because in these cases you would have a formal graduate committee, composed of faculty, deciding that the dissertation thesis is a good one -- that the hypothesis and design are solid, and formally approving the dissertation proposal -- and then because the results are null, it's unacceptable. If this was being done so casually in that forum, I can't imagine what goes on behind the scenes.


What makes that even more absurd in my books (apart from fostering an unhealthy academic culture) is that the purpose of a doctoral dissertation is essentially to show that you can do proper original research.

Getting a null result doesn't invalidate that in any way.


Exactly. In our area, the dissertation itself and defense were evaluated on the presentation of the research, not on the results, provided that they followed the proposed plan.

If you have a committee of experts who carefully evaluate a proposal and decide it's good, the results are as they are.

Broadening the discussion a bit, it seems one feature of science, as opposed to, say philosophy, is that the conclusions regarding a hypothesis are not knowable a priori. I think in contemporary academics there's some implicit idea that the quality of a researcher lies in their ability to identify hypotheses that are "correct", as opposed to simply following through with good but ultimately "incorrect" hypotheses. There's a bit of a roll of the dice involved with science; if there isn't, it's not science.


> it seems one feature of science, as opposed to, say philosophy, is that the conclusions regarding a hypothesis are not knowable a priori.

That should be a defining characteristic of any academic inquiry, regardless of whether it's science or not.

I have no training in non-quantitative fields, and my academic experience is from CS where the "science" part is often so-so. As such this should be taken partially as a layman view. However, my impression is that while in non-scientific academic fields the research isn't necessarily taking the form of explicit hypothesis testing, more or less similar criteria for intellectual inquiry should apply.

The research might be more about observation and critical (often non-quantitative and non-absolute) evaluation of arguments, and as such the validity of the methods (such as whether the hypothesis is assumed or genuinely questioned) might not always be as easy to judge [1].

The process might not be as easily formalized or judged as in science, but the mentality of critical inquiry should be similar. If the hypothesis is assumed and not questioned, that's no longer any kind of academic inquiry. It becomes politics, in the pejorative sense.

> I think in contemporary academics there's some implicit idea that the quality of a researcher lies in their ability to identify hypotheses that are "correct", as opposed to simply following through with good but ultimately "incorrect" hypotheses.

I think that's partially just psychology and human nature. We like results that make us directly know (or think we know) more, and we like results that basically tell us we still don't know, less. Few people like uncertainty.

The society outside of the academia certainly values the former more than the latter, and funding and other external incentives probably exacerbate the underappreciation of negative results.

[1] Or perhaps it is, to an expert, but having that judgment would require the kind of experience in those fields that I don't have.


I have a similar horror story.

A medical student worked hard to analyze, say, 40 x-rays out of hundreds available. He found no significant evidence for some hypothesis. When he told his supervisor, the reply was: "Well then you should just analyze some more x-rays. I'm sure you'll have a statistically significant result at some point."
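
That advice is a textbook case of optional stopping: keep adding data and re-testing until p dips below .05. Here is a rough simulation of my own (pure noise, a one-sample z-test re-run after every new batch of "x-rays") showing how far above the nominal 5% the false positive rate climbs; the specific numbers are illustrative only.

    # Rough sketch (mine, illustrative only): optional stopping on pure noise.
    # Start with 40 "x-rays", re-test after every batch of 10 more, and count
    # how often p < .05 is reached even though there is no real effect.
    import random

    def peeking_study(start=40, batch=10, max_n=400):
        xs = [random.gauss(0, 1) for _ in range(start)]
        while len(xs) <= max_n:
            n = len(xs)
            mean = sum(xs) / n
            # one-sample z-test against 0; the true sd is 1 in this toy setup
            if abs(mean) * n ** 0.5 > 1.96:
                return True
            xs.extend(random.gauss(0, 1) for _ in range(batch))
        return False

    runs = 1000
    false_positives = sum(peeking_study() for _ in range(runs))
    print(f"False positive rate with repeated peeking: {false_positives / runs:.2f}")
    # well above the nominal 0.05 of a single pre-planned test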


Which is why p-value thresholds need to be proportional to some nontrivial inverse factor of the number of experiments done in the field. Are there 10,000 researchers doing 20,000 experiments per year? Take 20,000, multiply by the number of years we expect an academic to do hands-on work (30?); invert: you need a p-value better than 1:600,000.
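
Spelling out that arithmetic with the made-up numbers above, purely as an illustration:

    # Illustrative only, using the hypothetical numbers from above: a
    # field-wide Bonferroni-style threshold over one academic generation.
    experiments_per_year = 20_000
    career_years = 30
    total_experiments = experiments_per_year * career_years  # 600,000
    threshold = 1 / total_experiments
    print(f"required p-value < 1/{total_experiments} = {threshold:.1e}")  # ~1.7e-06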


This! Bonferroni correction across all conducted studies/analyses. I will suggest this next time I am reviewing a paper with crazy claims and shitty statistics.


I know this is mostly a joke, but what you really want is Benjamini–Hochberg correction, unless you want to prevent even a single false discovery in all of science. FDR vs. FWER (false discovery rate vs. family-wise error rate).
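
A minimal sketch of my own, just to make the distinction concrete: Bonferroni divides the significance threshold by the number of tests, while Benjamini–Hochberg compares each sorted p-value against a rank-proportional threshold, so it typically rejects more. The example p-values are made up.

    # Minimal sketch (illustrative): Bonferroni (controls FWER) vs.
    # Benjamini-Hochberg (controls FDR) on the same list of p-values.
    def bonferroni(pvals, alpha=0.05):
        m = len(pvals)
        return [p <= alpha / m for p in pvals]

    def benjamini_hochberg(pvals, alpha=0.05):
        m = len(pvals)
        order = sorted(range(m), key=lambda i: pvals[i])
        # find the largest rank k whose p-value sits under the BH line k/m * alpha
        max_k = 0
        for rank, i in enumerate(order, start=1):
            if pvals[i] <= rank / m * alpha:
                max_k = rank
        reject = [False] * m
        for rank, i in enumerate(order, start=1):
            if rank <= max_k:
                reject[i] = True
        return reject

    pvals = [0.001, 0.008, 0.02, 0.04, 0.30]
    print(bonferroni(pvals))          # [True, True, False, False, False]
    print(benjamini_hochberg(pvals))  # [True, True, True, True, False]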


Somehow one of the 3 experimental groups has 16 people, whereas the other groups have 14 people. Wonder why...


For what is worth studies do end up with different amounts of people due to dropping out, people not matching criteria which are only checked later, indivisibility by number of groups etc.


Yes, there are legit reasons for this to happen, as well as cherry-picking.


Much of such dubious "science" is packaged by self-help authors as miracle principles which are going to set one's life right. Popular books like "The Power of Habit" and Cal Newport's books also seem to fall into this category, and seem to enjoy huge patronage, even on HN.


There is a cottage industry of what you may call, if you feel uncharitable, bullshit peddlers:

Simon Sinek

Cal Newport

Charles Duhigg

Mark Manson

Ryan Holiday

Malcolm Gladwell

The list is endless to be honest, they are each different on their own way but they have the following common points:

- They are in this for the money, so expect them to always be pushing their books, products, next book, next tour, next program. Hustling, hustling, hustling.

- Their grandiose pronouncements come with little or no serious backing.

- Their unwarranted sense of speaking from a position of authority

- The over-simplification and stupid generalization of what is messy, complex, and very much unique.


I don't understand why the desire to make money from one's work would be correlated with how bullshit the work is. It's not. The desire to make money is only sometimes associated with bullshit (but we tend to remember it more because it leaves a bad taste). If that were not the case, no capitalistic society would ever have worked at all.


A good rule of thumb is to assume increasing probabilities of bullshit for subjects further to the left on this scale:

https://xkcd.com/435/

Even biology outside of a cellular level is already above 50% BS for me. It is just insanely difficult to have the necessary controls.


On the cellular level too! I remember looking at some summarising paper on Duchenne muscular dystrophy for educational purposes. There was a question regarding the concentration of a certain molecule inside affected muscle cells. Five papers were quoted: 3 showed that the concentration is way above normal if DMD is present, while the other 2 showed the complete opposite. You cannot imagine anything like that in math or physics.


Most popular "nonfiction" fits into this category, really.

There is a pattern to these books

* Pick a topic

* Decide on a narrative

* Pick a collection of studies to demonstrate that narrative

* Take study conclusions (which are often dubious extrapolations of the data) and summarize them vaguely, adding additional unsupported projections, a handful per chapter

* Publish and promote

It isn't just "self-help" but nearly everything in the nonfiction section that you hear people talking about.


I'd bet that even more dubious science is packaged by large corporate interests. Take Big Tobacco, who had doctors saying smoking is healthy, or the sugar industry, corrupting guidelines and policy on coronary heart disease [1]. And medicine is not immune either, with financial conflicts of interest and pharmaceutical sponsorships correlating with outcomes. [2]

[1] https://www.npr.org/sections/thetwo-way/2016/09/13/493739074...

[2] https://en.wikipedia.org/wiki/Metascience#Medicine


I don't disagree with you, but might provocatively suggest that federal funding in the US vis-a-vis a large university could also be considered a large corporate interest at this point.


Careful now... You might make someone skeptical of vaccines. /s


Who do you look to for wisdom about how to live your life?


In general, nobody who is trying to sell it to me.


I absolutely love this answer, it's the only correct answer. I look to the people around me who have modeled leadership, good loving relationships, and productive respectful communication... then I try to mimic their behaviors. Those people are far wiser in their actions than any of those authors are in their words.


Could you please explain what Cal Newport has stated that is considered "dubious science"?


I'm interested as well. I agree with the overall point and do think Cal focuses too much on expert performance (Ericsson et al.) while completely ignoring tacit knowledge for some reason. But I've been reading his stuff for years and never thought he was one to peddle his own products beyond "this worked for me, might work for you".

In that sense I find him way better than other authors listed: he actually makes good use of the tools he recommends as a professional (as opposed to making a living spouting bullshit about other people's work).


Is there a method of flagging papers in scientific journals that have been criticized or refuted, e.g., by later studies or proven inability to replicate the data?

In legal research services, like Lexis Nexis or Westlaw, many cases are "flagged" when a later case or statute reverses, narrows, or otherwise affects the earlier case. This system warns lawyers that they may not be able to cite the flagged case in their current work. Of course legal research services also come with their own issues and costs; some of which are likely associated with this system.


A journal can publish a retraction.

The website Retraction Watch[1] aggregates these retractions and provides a database that you can query. Reference management software like Zotero[2] can use this to monitor your collection of papers and notify you when one is retracted.

[1] https://retractionwatch.com/

[2] https://www.zotero.org/blog/retracted-item-notifications/


Yeah. They could. But few (zero?) studies are retracted for the sake of being proven incorrect later. And, to be fair, it would be ridiculous. Imagine having your career nullified because when you're 60 some major breakthrough shows that your studies aren't relevant anymore. Your work was good when you did it, but now there's something new. That's kind of the definition of scientific progress.

However, as a counter example, in my very narrow specialty there is a well known lab that has produced highly cited bogus studies. I've personally published opposing results and said, "these studies are wrong for these reasons" using almost exactly those words. Should they be retracted? Absolutely. Will they ever be? No. Because, of course, the publisher and the authors just point the finger back at me and say "no, you're wrong!" and that's more than enough to keep the vague debate going.


An issue that comes to mind is, ironically, authority. Who decides when a paper has been discredited? I can see all sorts of incentive problems with such a system. On the other hand, Westlaw is only cataloging what has already been decided. If the Supreme Court overturns a prior case, then it's overturned whether you agree with the reasoning or not.


I have no basis to argue against the article, but I would think there is more incentive to game it now so I wouldn't take it as a given that things have improved.


Bonferroni effects have, at least in the popular consciousness, become much more front and center. I wonder how many people on the street can define p-hacking. Most educated people know about the reproducibility crisis, and I would wager that around half of them understand the cause. That seems like it might create a better environment.


> around half of them understand the cause

I’m in the half that doesn’t know, apparently.

I considered that with hard sciences, the cost, time, equipment, conditions, etc. were limiting factors. I think this paper on subatomic particles changing into antimatter is interesting: it was observed in a unique facility, once, under some condition that can never repeat, etc. You just kinda have to take their word for it.

As to soft sciences… I really don’t know there. Give me a hint?


Bonferroni effects. In the parent article referred to as researcher degrees of freedom, in pop culture referred to as p-hacking. In machine learning, it's referred to as overfitting. They are all the same phenomenon, of having more than one hypothesis, and then using statistical measures that assume that you only have one hypothesis to claim confidence. Journals only publishing positive results is an example of this phenomenon. Changing your hypothesis to match your data is another such example.
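
A toy simulation of my own makes the inflation concrete: run "studies" on pure noise, test 20 hypotheses at p < .05 in each, and most of them will still "find" something. The sample sizes and counts below are arbitrary choices for illustration.

    # Toy simulation (mine, illustrative only): each "study" tests 20 true-null
    # hypotheses at p < .05; count how many studies report something significant.
    import random

    def one_study(n_hypotheses=20, n=30):
        for _ in range(n_hypotheses):
            a = [random.gauss(0, 1) for _ in range(n)]
            b = [random.gauss(0, 1) for _ in range(n)]
            # crude z-test on the difference of means; both groups are pure noise
            mean_diff = sum(a) / n - sum(b) / n
            se = (2 / n) ** 0.5  # true sd is 1, so this standard error is exact
            if abs(mean_diff) / se > 1.96:
                return True
        return False

    runs = 2000
    hits = sum(one_study() for _ in range(runs))
    print(f"Null 'studies' reporting a significant result: {hits / runs:.2f}")
    # roughly 0.64, versus the 0.05 a single pre-registered test would imply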


What I don't understand is that psychology research (at least in academia) does not seem to have moved beyond the "we locked three dozen college kids in a classroom and had them perform some bizarre acts, through which we hope to pierce the veil of human nature" style of research. I feel like if something good could have come out of the social media age, it is that we have documented the natural behaviour of a vast number of people over long stretches of time. I think this is the sort of invaluable data that has the potential to advance the quantitative understanding of human nature.


I've always been perplexed by this, too.

Aren't the two most important aspects of the research the data set and the study methodology? Why on earth would you skimp so heavily on one of them?

I don't work in the sciences, but this kind of nonsense doesn't exist in the "actual" sciences. Physicists spend loads of money producing just the right experiment conditions and documenting the manner the experiment was created in. The dataset is incredibly important and very rigorously examined.

But in psych, the dataset is basically an afterthought. "Oh by the way, we chose a small handful of kids who happened to be free at that time, with no reason to believe there's any geographic, social, educational, political, or ethnic background diversity; it probably cost us like $200 plus some pizza. Now let's print the results in $5 million worth of textbooks for a few decades!"

I don't buy the funding argument. A professor probably costs the university 100-150k/yr and will be working on a small handful (2-6, ish?) of projects. Buying an hour of a subject's time for a study must cost, what, $30/hr? Shouldn't they be allocating a minimum of $50k in funding for the actual research, and dropping at least $10k for a good dataset?

I don't buy the argument that most experiments don't yield good results so the university is wary of funding them. At a minimum they should follow up a cheap test with promising results with a real experiment that has actual funding before everyone gets all excited about it.


The part that confused me the most, as someone who has never done research science, was how the questions were casually categorized as gender, ethnicity or no-identity salient.

If someone is from Brazil and is of second- or third-generation Japanese descent, how many of the questions are 'salient' to Brazilian identity vs. Japanese? There's an unspoken implication that part of the 'good at math' stereotype relies to some degree on speaking a non-English language at home, which I don't think is a safe assumption at all.

> In the Asian identity-salient condition, participants (n = 16) were asked (a) whether their parents or grandparents spoke any languages other than English, (b) what languages they knew, (c) what languages they spoke at home, (d) what opportunities they had to speak other languages on campus, (e) what percentage of these opportunities were found in their residence halls, and (f) how many generations of their family had lived in America.


But this would bias results towards zero, no?


> If you think psychological science is bad

I know it's often repeated, but is there evidence that "psychological science" is worse (by some measure and to a significant degree) than other social sciences? As long as we're talking about science, let's look at evidence! ;)

Anyone studying human behavior, on any scale, has the added challenge that it's so complex that you can't really isolate simple mechanics, as you can in physics or chemistry. People are far more complex than a molecule, or even a billion molecules.

BTW, I'm aware of the well-known cite, the NY Times article on the 'reproducibility crisis' from a few years ago. Few seem to have read the article: the results were reproduced, but many were at lower strength than the original research. That's important, but it's not like the researchers just completely missed and the results were arbitrary.


The realities of science as actually practiced rather complicate the religion of "trust the science!" (which usually actually means "trust the scientists!")


The science-as-religion people think that science delivers truth in the same way any other theology does.

I took a philosophy class filled with people in science degree programs, and a few of my classmates were often vocally upset about how nothing was certain in philosophy and everything had multiple sides to it. That was very eye-opening: many of these people were soon to graduate, and through their entire educational career they had only been exposed to Truth, to the extent that being shown debate and disagreement on a topic made them upset.

You're not supposed to "trust the science" you're supposed to trust the process to approach the truth. If you can't read multiple arguments on the same topic and analyze them, you really don't get it at all (and waaay too many people with degrees can't do this).


> You're not supposed to "trust the science" you're supposed to trust the process to approach the truth. If you can't read multiple arguments on the same topic and analyze them, you really don't get it at all (and waaay too many people with degrees can't do this).

Agree entirely. I’ll add that I go slightly further. If you can’t take your pet topic, and can’t make even a slightly good faith argument against yourself, you have no business with strong feelings on it.

I have a wedge issue topic I am an expert on. I could argue against myself, both effectively, and in an actual compromise that no one wants.

Yet… people who argue against “my side” are constantly using complete bullshit science from the 1980/1990s when governments literally weaponized depts and ivy leagues to push for “evidence” to support their desired policy changes.

These people now tell me to “trust the science”, “I’m sure this researcher at Harvard is wrong and you’re right”, and ”this article from CNN / FOX / VOX / WAPO proves you are wrong and I refuse to consider they have an agenda”.

The worst part is that there is no shame in willful ignorance; they "trust" the people they claim can't be wrong, simple, done. Why should they bother to acknowledge another side - if they do, it means everything else needs reevaluation too.


“In God we trust. All others must bring data.” - W. Edwards Deming.


I don't even trust data anymore


Agreed. It's not like the good old days when data was about one physical process that didn't have too many downstream impacts.


I studied CogSci in the late 90s, which involved taking psychology courses. Every psych course that ended in zero required you to participate in psychology experiments.

The profs liked to say that 18-22 year old, mostly white/Asian, mostly rich kids are the most studied group of people in the USA.

Of course we now know that a ton of psychology research doesn't actually apply to people outside that small, narrow window of people.


The social sciences have a term—“WEIRD” (Western, Educated, Industrialized, Rich and Democratic) to describe the population that most psychological and social science research has studied the most deeply. As the acronym implies, these populations are not normal across human history or even the modern world.

One of the fascinating concepts in abnormal psychology is the notion of a “culture-bound syndrome”—a mental illness that only occurs in a specific cultural milieu. I wonder how many mental illnesses are actually culture-bound syndromes of WEIRD culture?


It doesn't apply within that window either.


“Don’t hate the player, hate the game.”

Indeed. It is a scale problem. We have too many producers of research, too few destroyers of research, like Gelman. Show me the incentive and I can tell you the outcome. Encourage the whole world to become “experts” and then be amazed as the reverence and trust in expertise is devalued. That’s us.


If it is not a hard science, then it is likely a variant of philosophy.


I think you could effectively argue that mathematics is a branch of philosophy.


You can construct proofs for mathematics, but not for the existence or refutation of a divine creator.


I think you have a narrow view of philosophy. Very little modern philosophy has anything to do with that kind of question.


Nope, you're projecting and using a strawman. That's not the point. :)


True, but with math you also eventually get down to axioms that can't be proven.


I don't think we know if we can know everything or not. That seems like _the_ question to answer.



What is a “hard science”?

The only distinction I care about is exactness and non-exactness.

Is the research based upon formulating a theory that is capable of forecasting not-yet-observed events within an exact margin of error, and are the conditions then re-created to see if the forecast falls within the margin of error of the instruments that measure it?

Some say biology is “hard”, and some say it is “soft”; some say many parts of cosmology are “hard” but they certainly aren't “exact”.

In an exact science there are typically multiple ways to derive the same answer within one theory, and they all arrive at the exact same result.


This has been answered a million times before.

https://en.wikipedia.org/wiki/Hard_and_soft_science


And it starts by saying that the terms are colloquial and bereft of an actual hard definition.

“colloquial”, “roughly”, “perceived” — these are not the terms that definitions are made of.

The point is that there is no actual hard distinction between “hard science” and “soft science” but there is a hard distinction between an exact theory, and an inexact theory.


There is no exact science, physics isn't an exact science either.

You need softer definitions like "Hard" and "Soft" precisely because there are no exact results in any science we have. And it is fine to use soft definitions for these things since categorising scientific fields doesn't need to be a science.

A hard science is where replication is expected to never fail, if replication fails once then the theory is thrown out. Applying that to social science studies would seem ridiculous, no social scientist would want that, so they want their field to be soft.


> There is no exact science, physics isn't an exact science either.

Which is why I said “exact theory”.

> You need softer definitions like "Hard" and "Soft" precisely because there are no exact results in any science we have. And it is fine to use soft definitions for these things since categorising scientific fields doesn't need to be a science.

There are exact results every day, but those are not delimited cleanly by “fields”: a theory is exact or it is not.

> A hard science is where replication is expected to never fail, if replication fails once then the theory is thrown out. Applying that to social science studies would seem ridiculous, no social scientist would want that, so they want their field to be soft.

And this criterion is never mentioned at any point in the Wikipedia article linked.

It also seems a useless definition, as replication can always fail due to flukes, and the confidence thresholds used for replication, typically 0.05, are chosen quite arbitrarily.

Whether it is “replicated” or not is a rather arbitrary delimitation of an arbitrarily picked number, and 5% is certainly not improbably low to begin with.


‘Hard science’ is just a title people have been using to distinguish physical sciences from social sciences. Don’t get too hung up on worrying what ‘hard’ means or thinking that it is some kind of difficulty or value judgement; it’s not. It may have been at some point, but today it’s just a category and nothing more. This WP entry is much more clear than the above one, IMO: https://simple.wikipedia.org/wiki/Hard_science


Reading the other comments about the lack of integrity in scientific research that they’ve witnessed, is it any wonder why people might be skeptical about important issues like climate change, vaccines, masks, etc? There’s great research being done and a larger part of the population is becoming more and more skeptical because lies are being published and touted. Science needs to clean its house.


Independent replication needs to be incentivized, either as part of tenure process or have grad students do it as part of getting a PhD or something. Then the results are published in a journal of replication studies.


The skepticism about climate change is due much less to issues with the scientific community than to decades of propaganda funded by people who fear losing money to efforts to fight it.


Are you suggesting there is only money, power, and propaganda by deniers of climate change? Don’t you think it’s possible there are people exaggerating it for money, and power?

I don’t think ”they use propaganda and everyone who disagrees is just wrong” is a good argument when the topic is “we know in other areas there are issues with the scientific community so why not this one”.

We can still have the right answer and have gotten there the wrong way.


Psychological science is now thoroughly privatized. Don't read papers if this subject is of interest to you; look at big tech, ad tech, troll factories, influencer science, media consolidation, for-pay research-charities (a great rabbit hole to dive into, btw).

Academics are thoroughly out of the loop on this one, as are we all incidentally.


I became a Data Scientist because in 3rd year, I found the OKCupid data blog and realized their research had a methodological soundness that this "20 undergraduate lab students forced to be there" research never would.

You trade 'shot-in-the-dark lab experiments' for 'a clear and obvious agenda'. The trick is to just make sure the agenda isn't morally reprehensible.


more than one third of the entire world population believes that OKCupid is "morally reprehensible" (!)


This is an interesting statement. Do you have links to something more I can read? Who thinks dating sites are morally reprehensible?


[flagged]


Ok, I have some reason to agree with you, but if we rule out religion as a valid basis for morality, what do you suggest we replace it with?


I'm not even saying religion can't be a basis for morality, I'm just saying religious BIASES aren't valid arguments. Don't kill/murder/steal: those are great, clear lines about interacting with others. There's a clear benefit to all for those rules. Don't turn on your oven on Saturdays, pet dogs, or cut your beard: those aren't widely applicable.


There's a pile of different ethical frameworks. So likely work with a few of them and factor them all in when making decisions.


Act or Rule Utilitarianism.


> for-pay research-charities (great rabbithole to dive into btw).

Any getting-started pointers for this (or your other suggestions)?


A useful keyword is psychographics, but you'll find that much research is privatized (comes from media/software companies), and that some of the meat seems to be not publicized. Which makes sense, it's what gives them an edge over the competition. The Cambridge Analytica dossier is of course a good way to find references.

In terms of the why, and who's paying, Dark Money (Mayer) and Democracy in Chains (MacLean) touch on it: psychographics is huge for anyone doing manipulation, not just commercial ads but also political ones. Probably even more so.


Do we have to imagine? I'm sure many scientists in the field today were working in 1999. There's certainly enough data from the time to look at this in some level of detail, but cherry-picking an article and pointing out its flaws doesn't really prove much about the state of the field back then.

Maybe if the author mentioned that this article was highly regarded back then, there would be a point, but for all we know, the article was thought poorly of at the time and contemporary scientists just thought it slipped through the cracks.

It also doesn't talk about new controls in place today that would prevent a similarly poor article from being published, or even a system of "retracting" poor articles. I don't really trust that everything being published today is without flaw. After all, the other examples of bad science given are fairly recent.


Interesting how, as I began reading the excerpts from the paper, I had to stop myself and realize that the author himself had ironically instilled a bias against the paper, LOL. I was primed to find all the flaws.

I think psychology and sociology are legitimate and worthy studies, but they run into issues with the scientific method itself due to the ambiguous and “high-level” nature of their concepts and theories. It’s hard to create meaningful, repeatable experiments. So perhaps it should be emphasized how important it is to put effort into constructing experiments... and in particular keeping the subjects unaware of what is being tested. There are probably many great examples of experiments done well.



I don't get the downvotes.


"Science"


My 5 cents.

There is not one but two psychological sciences at the moment. One is public, publicly funded and in very bad shape, with most of its results being not reproducible (the replication crisis), and a partial destruction happening via the neurosciences.

The neurosciences destroy, but do not offer, large-scale replacement theories that could encompass the whole species and that are not in contradiction with other neuroscience results.

And then there is the second faction. (Disclaimer: I cannot prove what is deduced after this disclaimer.)

There are several corporations, and at least one government, which have had the chance to collect data on the population at large scale.

This data is a psychological gold mine, if explored properly. One could query such a behavioral database and, more importantly, enact virtual experiments.

Out of all male humans who curse in front of the TV in the evening, filter out those who get into a car accident, then plot the increase in cursing in front of the TV.

Taken to the extreme, this new data-mining behavioral science could create an agent-based model of the species in all its variations, and collect data only to check the expected outcome of a societal change against the real outcome, with spot samples.

I have my own little pet theories about how humanity would look to this privatized psychology, but I digress.

I think academic psychology should have full access to all corporate databases that contain behavioral data.


> I think academic psychology should have full access to all corporate databases that contain behavioral data.

No thanks. We already have quack science from the 1950s still driving policy discussions on wedge issues today. I don’t want more convincing shit, I want less shit.


Actually, I do not think that paper is that bad compared to the majority (well, I do not think psychology currently is a science).


This is a flawed logical argument. Imagine this article from 1830:

"If you think phrenology is bad, imagine how bad it was in 1810".

Rinse & repeat for the appropriate time frames for alchemy, astrology, or any other field that tried to misapply science. Just because a field is studied for a long time or tries to apply the scientific method doesn't lend credence to the approach. All it means is, at best, we've managed to toss some things that are now obviously wrong/flawed. In 20 years we'll be doing the same to things we "know" today OR the flaws will remain because we don't have the math/science to demonstrate the flaws more obviously & there's social pressure to keep "building" (even if the foundation is flawed). However, as we should all be aware, false knowledge grows exponentially more quickly than our true understanding of the universe because our imagination is limitless.


To anyone downvoting me, consider this HN article from not too long ago: https://news.ycombinator.com/item?id=27489927

The general premise with these studies is that if an effect size is real, then a preliminary study would show something interesting. To my knowledge, statistically that is a nonsense argument. Small sample sizes suffer from various small-sample effects to the point that you can't predict either way (otherwise there wouldn't be a point in doing a larger study). To add insult to injury, all of these kinds of studies are only on local college students, which further invalidates any potential information gleaned from a preliminary study.

TLDR: The way science is done in the social sciences is fundamentally flawed, and the fact that limited funding ensures that's the case doesn't excuse the fact that a significant part of the body of knowledge is unreliable.


This is what happens when everyone is told they're awesome and nobody is allowed to fail


Gee you talk about 1999 like it was more than 20 years ago. Oh wait … (damn I’m old)


1999 to me was almost yesterday.


I went through it in a clinical sense around that time. This was back when I had no legal right to demand to read reports about me for correction; that right came later, and when I exercised it, the tunnel vision in the report explained many things about the conversation.

What I remember most is that the clinical psychologist kept fishing as to why I covered my face with my hair. I kept saying that there was no reason other than gravity, and that I cannot control that my hair obscures parts of my face. Yet the report stated that I did it on purpose to hide my face, which I'm fairly certain I did not; it seemed this was what the clinical psychologist settled on early and kept searching for evidence to support.


Those dummies. They had perfectly good time machines in 1999 and they didn't even use them to visit our enlightened era. I bet they did all their calculations on some old Windows 95 system instead of investing in solid multiple core machines.

Still, they don't hold a candle to Gregor Mendel failing to incorporate DNA in his genetic work, if you can believe it.



