My whole perception of academia and peer review changed that day.
Edit to elaborate: like many of our institutions, peer review is an effective system in many ways but was designed assuming good faith. Reviewers accept the author’s results on faith and largely just check to make sure you didn’t forget any obvious angles to cover and that the import of the work is worth flagging for the whole community to read. Since there’s no actual verification of results, it’s vulnerable to attack by dishonesty.
When I got into university and started alternating studying and work, I realised just how incredibly clueless even adults are. The "let's just try something and hope nothing bad happens" attitude permeates everything.
It's really a miracle that civilisation works as well as it does.
The upshot is that if something seems stupid, it probably is and can be improved.
A lab exercise like that could really just be selecting for chutzpah (feeling charitable) or arrogance (less charitable).
Very valuable lesson, although it sure did suck at the time.
He got criticized for it.
Robocall scams are very high on the profit:human misery scale, but they’re hardly going to end civilization. Pollution, corruption, theft etc. all make things worse, but we never see the better world without such things, so it all feels very abstract. Of course you need to lock your doors etc.; that’s just the way things are.
It has to work well enough to… work… and reproduce. That’s it. It’s not “survival of the fittest.” It’s “survival of a randomized subset of the fit.”
There’s even a set of thermodynamic arguments to the effect that systems are unlikely to exceed such minimum requirements for a given threshold. For example, if we are visited by interstellar travelers they are likely to be the absolute dumbest and most dysfunctional possible examples of beings capable of interstellar travel since anything more is a less likely thermodynamic state.
So much for Star Trek toga wearing utopian aliens.
Otoh, they would be aware of that, and they might have spent some time improving how genes (or what they have) and evolutionary selection work for them, so that, say, their species with time becomes brighter and brighter than what's actually needed. If they wanted to do that.
Also more intelligence does not equal better ideas. The world is full of crazy or amoral people with apparently very high IQs. Your average flat Earther probably has an above average IQ.
Improvement is a war against entropy and n^n^n^… combinatorics any way you slice it.
Slowly across hundreds and thousands of generations.
By adding evolutionary pressure, for what they want -- it'd be up to those space traveling aliens to decide -- they can change their species, generations into the future.
> Improvement is a war against entropy ...
Reasoning in that way, the humans would not have gotten brighter than the chimpanzee monkeys. There's been evolutionary pressure for the humans to get brighter, and it would be possible for you (I mean the humans), or the space travelers, to add artificial ev. pressure.
Anyway never mind all this, maybe talking about space travelers and the humans and their genes isn't the best way to spend the day. Have a nice day btw
So after a certain time spent, you are left with a choice of ‘massaging’ the data to get some results, or not and getting left behind those that do or were luckier in their research.
That can end up being just as time consuming as doing the research to begin with. Often there is no time and no money to go back and do that. If your 'budget' is 6 months you're going to spend 6 months trying to get your experiment to work. You're not going to 'give up' after 4 months and spend 2 months putting together a "why we failed" paper.
For example, I imagine that archeological work is extremely high impact if excavation efforts lead to the discovery of an ancient city.
An archeology paper would probably be less interesting if it said “we dug this area, found nothing”.
If one were to judge those two papers, obviously the discovery paper is higher impact than the negative result.
Not as valuable as a discovery, but very far off zero value. Yet the reward in academia would be near-zero.
Ultimately, this is the problem.
I think this all the time.
Still, reading your comment makes me despair. It plants a nagging doubt in my mind, "how many of these zillion cited studies are actually replicable?" This doubt remains despite knowing that the scientist is one of the leading experts in the field, and very down-to-earth.
What are the solutions here? A big incentive-shift to reward replication more? Public shaming of misleading studies? Influential conferences giving more air-time for talks about "studies that did not replicate"? I know some of these happen at a smaller-scale, but I wonder about the "scaling" aspect (to use a very HN-esque term).
PS: Since I read Behave by Sapolsky — where he says "your prefrontal cortex [which plays critical role in cognition, emotional regulation, and control of impulsive behavior] doesn't come online until you are 24" — I tend to take all studies done on university campuses with students younger than 24 with a good spoon of salt. ;-)
It’s probably not all bullshit but I would bet a double digit percentage of it is.
Many people might spare themselves at least some misery by educating themselves about evolutionary psychology, including the landmines and open questions.
Therefore it is not well suited to figure out the world.
You should treat all of it with extreme helpings of salt.
Also, I don't think ending poverty is a major stated goal of psychology research. . .
I think the problem is much bigger than simply a binary is it replicable or not. It’s extremely easy to find papers by “leading experts” that have valid data with replicable results where the conclusions have been generalized beyond the experiments. The media does this more or less by default when reporting on scientific results, but researchers do it themselves to a huge degree, use very specific conditions and results to jump to a wider conclusion that is not actually supported by the results.
A high profile example of this is the “Dunning Kruger” effect; the data in the paper did not show what the flowery narrative in the paper claimed to show, but there’s no reason to think they falsified the results. Some researchers have reproduced the results, as long as the conditions were very similar. Other researchers have tried to reproduce the results under different conditions that should have worked according to the paper’s narrative and conclusions, but found that they could not reproduce, because there were specific factors in the original experiment that were not discussed in the original paper’s conclusions -- in other words, Dunning and Kruger overstated what they measured such that the conclusion was not true. They both enjoyed successful academic careers and some degree of academic fame as a result of this paper that is technically reproducible but not generally true.
To make matters worse, the public has generally misinterpreted and misunderstood even the incorrect conclusions the authors stated, and turned it into something else. Almost never in discussions where the DK effect is invoked do people talk about the context or methodology of the experiments, or the people who participated in them.
This human tendency to tell a story and lose the context and details and specificity of the original evidence, the tendency to declare that one piece of evidence means there is a general truth, that is scarier to me than whether papers are replicable or not, because it casts doubt on all the replicable papers too.
Out of curiosity, what's the title of the book?
There's obviously more complexity than this, but I believe that if even a relatively small percentage of the population started thinking like this (particularly, influential people) it could make a very big difference.
Unfortunately, this seems to be extremely counter to human nature and desires - people seem compelled to form conclusions, even when it is not necessary ("Do people have ideas, or do ideas have people?").
Isn't that the moment where you try even harder to falsify the claims in that paper? You already know that you'll succeed so it wouldn't be a waste of time in your effort.
The main problem is that even if you reproduce their experiment, they can claim that you did some step wrong: perhaps you are mixing it too fast or too slow, or the temperature is not correctly controlled, or one of your reagents has a contaminant that destroys the effect. Or they magically realize that some detail of their own reagent is what matters.
It's very difficult to publish papers with negative results. So there is a high chance it will not count in your total number of publications. Also, expect a low number of citations, so it's not useful for other metrics like citation count or h-index.
For the same reason, you will not see publications of exact replications. A good paper X will be followed by almost-replications by other teams, like "we changed this and got X with a 10% improvement" or "we mixed the methods of X and Y and unsurprisingly^W got X+Y". This is somewhat good because it shows that the initial result is robust enough to survive small modifications.
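For anyone unfamiliar with the "h" metric (the h-index), here is a minimal sketch of how it is usually computed, and why a rarely cited negative-result paper does nothing for it. The citation counts are made up for illustration.

```python
def h_index(citations):
    """Largest h such that at least h papers have h or more citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h


# Hypothetical citation counts for one researcher's papers.
positive_results = [25, 18, 12, 9, 7]
# Same record plus one negative-result paper that picked up a single citation.
with_negative_result = positive_results + [1]

print(h_index(positive_results))
print(h_index(with_negative_result))
```

Both calls print the same number: the extra paper adds to the publication count but leaves h unchanged, which is the incentive problem described above.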
This is an example which did get cites:
But despite the high visibility, you can see the large number of papers published based on the original myth.
And this refutation doesn't have great methodology (but other ones do). It's mostly cited due to strong language used.
Assuming bad faith in peer review would make academia more interesting. The only way to do it would probably be for the peer reviewer to go to the lab and be shown live measurements. Then check the equipment...
One reasonable approach might be to look at which group has produced the 'best' research over the past few years. But how do you judge that in a way that seems fair? Once you have a criterion to judge that, then people will start to game that criterion.
Or taking a step up: the university needs to save money. How do you judge whether the Chemistry department or the Computer Science department should have its funding cut?
No matter how you slice it at some point you're going to need a way for someone to judge which of two departments is producing the 'best' research and thus deserves more money, and that will incentivize people to game that metric.
We aren't short on food, shelter, clothes, tech, etc - those are all solved problems.
The problem that isn't solved is stupid people sitting in charge of decisions they don't have the brain make-up to comprehend or manage, making pretend they know what they're doing, holding people far superior to them hostage.
One problem is PhD degrees are too costly to those who don't get academic or industrial success from them. But as long as talented people are willing to try to become a professor I don't see the system changing.
I think many more are drawn to professorship for a sense of status, ie prestige. It shows in their overwhelming mediocrity, eg the failure of economics to progress to a biologically scientific paradigm.
In the past people who did science could do so with less personally on the line. In the early days you had men of letters like Cavendish who didn't really need to care if you liked what he wrote, he'd be fine without any grants. That obviously doesn't work for everyone, but then the tenure system developed for a similar reason: you have to be able to follow an unproductive path sometimes without starving. And that can mean unproductive in that you don't find anything or in that your peers don't rate your work. There'd be a gap between being a young researcher and tenured, sure.
Nowadays there's an army of precariously employed phds and postdocs. Publish or perish is a trope. People get really quite old while still being juniors in some sense, and during that time everyone is thinking "I have to not jeopardise my career".
When you have a system where all the agents are under huge pressure, they adapt in certain ways: take safer bets, write more papers from each experiment, cooperate with others for mutual gain, congregate around previous winners, generally more risk reducing behaviour.
Perhaps the thing to do is make a hard barrier: everyone who wants to be a researcher needs to get tenure after undergrad, or not at all. (Or after masters or whatever, I wouldn't know.) Those people then get a grant for life. It will be hard to get one of these, but it will be clear if you have to give up. Lab assistants and other untenured staff know what they are negotiating for. Tenured young people can start a family and not have the rug pulled out when they write something interesting.
A better solution would be to stop overproducing PhDs. We could reduce funding for PhD students and re-direct that towards more postdoctoral positions - perhaps even make research scientist a viable career choice?
I'm suggesting that we re-direct some of the funding for training PhD students into funding for postdoctoral positions (via either fellowships or research grants). Professors would still get their research team, but rather than consisting mostly of untrained PhD students, they'd have a smaller, but more effective team of trained researchers.
Immediately after undergrad is how it used to work in the golden days of science, more or less.
If the competitiveness is the problem maybe tenure should be a lottery that you enter once at a fixed stage, preferably before you're expected to start publishing in journals.
A tenure lottery seems like an extreme option - there has to be a middle ground between what we have now and something entirely random.
The act of producing a doctoral dissertation usually leaves something of a mark on one's outlook, skills, etc. I claim it is a _distinguishable_ achievement for life.
Don’t (credible) journalists have an honour system of getting at least three sources for a story?
Can’t we make researchers get at least two more confirmations from separate teams for something far more important?
If publication would require two more confirmations from separate teams, that would mean (a) doing the work in triplicate, so you get a third of the results for the same effort; (b) the process would take twice as long, since I spend a year doing the experiment, only then can someone else start and spend a year doing the same experiment, and only then does it get published; (c) there's a funding issue - I have somehow got funding to spend many months of multiple people on this, but who's paying the other independent teams to do that?; (d) it's not a given that there are two other teams capable of doing the exact same research, e.g. if you want to publish a study on the results of an innovative surgical procedure, it's plausible that there aren't (yet!) any other surgeons worldwide who are ready to perform that operation, that will come some time after the publication; (e) many types of science really can't get a separate confirmation - for example, we have only one Large Hadron Collider, you can't re-do archeological digs, event-specific on-site sociological data gathering can't really be repeated, etc; so you have to take the data at face value.
Hopefully it is clear that that data is useless without some written text explaining what it means. Given that for hundreds of years the accepted way of presenting that explanatory text was by writing papers, I don't see any reason to abandon that. Tweaking our strategies for replication (after a description of the experiment has been published!) and reputation don't seem to contradict that.
For me a better solution would be to properly incentivise replication work and solid scientific principles. If repeating an experiment and getting a contradictory result carried the same kudos as running the original experiment then I think we'd be in a healthier place. The same goes for doing the 'scientific grind work' of working out mistakes in experimental practice that can affect results and, ultimately, our understanding of the universe around us.
I think an analogy with software development works pretty well: often the incentives point towards adding new features above all else. Rarely is sitting down and grinding through the litany of small bugs prioritised, but as any dev will tell you doing that grind work is as important, otherwise you'll run into a wall of technical debt and the whole thing will come tumbling down.
You have big companies making billions with the work of relatively poorly paid nerds. But as soon as you make it possible for the nerds to claim all the profits of the work then you have a whole class of people whose job is to insert themselves as middlemen and ruin it for everyone, both customers and developers.
So basically the aim is to limit the degree to which you can privately profit from science, and expand the amount of science you can easily build on. You still get enough incentives for progress, the benefits accrue to society as a whole, and competition and change is enabled without powerful gatekeepers controlling too much in their own interests.
One perspective is that, “knowledge generation wise,” the current system really does work from a long term perspective. Evolutionary pressure keeps the good work alive while bad work dies. Like that [Top Institution] paper: if nobody else could reproduce it, then the ideas within it die because nobody can extend the work.
But that comes at the heavy short term cost of good researchers getting duped into wasting time and bad researchers seeing incentives in lying. Which will make academia less attractive to the kind of people that ought to be there, dragging down the whole community.
Due to career and other reasons, there is a publish or perish crisis today.
Maybe we can do better by accepting not everyone can publish ground breaking results, and it's okay.
There are lots of incompetent people in academia, who later rise to senior positions and decide your promotions by citation counts and how many papers you published. I have no realistic ideas how to counter this.
We need to create a new social institution of Anti-Science, which would work on different incentives, tied to the number of refuted articles. No tenure, no long-term contracts. If an anti-scientist wanted an income, they would need to refute science articles.
Create a platform for holding scientific debates between scientists and anti-scientists, so that a scientist has the ability to defend his/her research.
No need to do anything special to prosecute, because science is very competitive, and the availability of refutations would inevitably be used to stop the career progression of authors of refuted articles.
Data manipulation generally doesn't happen by changing values in a data frame. It's done by running and rerunning similar models with slightly different specifications to get a P value under .05, or by applying various "manipulations" to variables or the models themselves for the same effect. It's much easier to identify this when you have the code that was used to recreate whatever was eventually published.
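To make the "rerun until P < .05" point concrete, here is a rough simulation sketch. Everything in it is an assumption for illustration: the data is pure noise (no real effect anywhere), each "specification" is modelled as a fresh independent comparison, and the p-value uses a normal approximation instead of a proper t-test. Real specification searches reuse the same data, so the tests are correlated and the inflation is smaller than here, but the direction is the same.

```python
import random
from statistics import NormalDist


def mean_and_variance(xs):
    """Sample mean and unbiased sample variance."""
    m = sum(xs) / len(xs)
    v = sum((x - m) ** 2 for x in xs) / (len(xs) - 1)
    return m, v


def p_value(a, b):
    """Two-sided p-value for a difference in means (normal approximation)."""
    mean_a, var_a = mean_and_variance(a)
    mean_b, var_b = mean_and_variance(b)
    se = (var_a / len(a) + var_b / len(b)) ** 0.5
    z = (mean_a - mean_b) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def study_finds_effect(rng, n_specs, n=50, alpha=0.05):
    """One study on pure noise; it 'succeeds' if ANY specification is significant."""
    for _ in range(n_specs):
        treated = [rng.gauss(0, 1) for _ in range(n)]
        control = [rng.gauss(0, 1) for _ in range(n)]
        if p_value(treated, control) < alpha:
            return True
    return False


def false_positive_rate(n_specs, n_studies=1000, seed=0):
    """Share of noise-only studies that end up reporting a significant result."""
    rng = random.Random(seed)
    hits = sum(study_finds_effect(rng, n_specs) for _ in range(n_studies))
    return hits / n_studies


print("one specification, fixed in advance:", false_positive_rate(1))
print("best of ten specifications:         ", false_positive_rate(10))
```

With independent tests, the chance of at least one hit in ten tries is 1 - 0.95^10, roughly 40%, against the nominal 5%. That is why having the analysis code matters: the published model looks fine on its own, and only the discarded runs reveal the search.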
I personally favor requirements which call for bundling raw datasets with the "papers". Data storage and transmission are very cheap now, so there isn't a need to restrict ourselves to just texts. We should still be able to check all of the thrown out "outliers" from the datasets. An aim should be to make the tricks for massaging data nonviable. Even if you found your first data set was full of embarrassing screw ups due to doing it hungover and mixing up step order, it could be helpful to get a collection of "known errors" to analyze. Optimistically it could also uncover phenomena that scientists thought were them screwing up, like, say, cosmic background radiation being taken as just noise and not really there.
Paper reviewing is already a problem but adding some transparency should help.
You don't have to convict people for full-on fraud. If you are caught using an obvious mistake in your favor or using a weak statistical approach, the punishment can be that you are not allowed to apply for grants without a supervisor/co-PI/etc. whose role is to prevent you from following that "dumb" process in the future.
Something like a well funded ten year campaign to do peer review, retrying experiments and publishing papers on why results are wrong.
I have a co-worker who had a job that involved publishing research papers. Based on his horror stories it seems like the most effective course of action is to attack the credibility of those who fudge results.
There will always be cases of fraud if someone digs deeply enough into large institutions. That doesn't actually indicate that there is a problem.
Launching in to change complex systems like the research community based on a couple of anecdotes and just-so stories is a great way of not actually achieving anything meaningful. There needs to be a very thorough, emotionally and technically correct enumeration of what the actual problem(s) are.
Research is heavily funded because people believe it's something more than a random claim making machine. You say governments should assume research is wrong and then try to replicate any claim before acting on it. But you end up in a catch 22: if the research community is constantly producing wrong claims there's no reason to believe your replication attempt is correct, as it will presumably be done by researchers or people who are closely aligned.
Additionally inability to replicate is only one of many possible problems with a paper. Many badly designed studies that cannot tell you anything will easily replicate. A lot of papers are of the form "Wet pavements cause umbrella usage". That'll replicate every single time, but it's not telling you anything useful about the world. Merely trying to fix things with lots of replication studies thus won't really solve the problem.
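A toy simulation of the umbrella example, as a sketch. The probabilities (30% rainy days, 90% umbrella use in rain, 5% otherwise) are invented for illustration. In the observational run the pavement is wet exactly when it rains. In the experimental run a hypothetical experimenter hoses the pavement on a coin flip, regardless of the weather.

```python
import random


def umbrella_rate_by_pavement(rng, n_days, hose_pavement_at_random):
    """Return {pavement_is_wet: share of days on which umbrellas were used}."""
    counts = {True: [0, 0], False: [0, 0]}  # wet -> [umbrella days, total days]
    for _ in range(n_days):
        rain = rng.random() < 0.3
        if hose_pavement_at_random:
            wet = rng.random() < 0.5  # experimenter sets wetness, ignoring the weather
        else:
            wet = rain  # observational world: rain is what wets the pavement
        umbrella = rng.random() < (0.9 if rain else 0.05)  # depends on rain only
        counts[wet][0] += umbrella
        counts[wet][1] += 1
    return {wet: used / total for wet, (used, total) in counts.items()}


def gap(rates):
    """Umbrella usage on wet-pavement days minus usage on dry-pavement days."""
    return rates[True] - rates[False]


for seed in range(3):  # three independent "replications"
    observed = umbrella_rate_by_pavement(random.Random(seed), 10_000, False)
    experiment = umbrella_rate_by_pavement(random.Random(seed), 10_000, True)
    print(f"seed {seed}: observed gap {gap(observed):.2f}, "
          f"hosed-pavement gap {gap(experiment):.2f}")
```

The observed gap is large on every seed, so the finding "replicates", while the gap under random hosing stays near zero. Replication confirms the correlation but says nothing about whether the causal story attached to it is right.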
"Wet pavements cause umbrella usage" is something where I'd want to see your specific examples because it's easy to get a correlational study of that nature but very hard to design a causal one. The correlational studies are usually accurate and often useful for other research.
So a lot of people only notice this in the rare cases when someone within the academy decides to write about it. This can make it seem like science is self correcting, but it appears in reality it's not. When measured quantitatively there is no real improvement over time. Alvaro de Menard has written extensively on this topic and presented data on the evolution of P values over the last decade:
Additionally as he observes at the end of his essay, the problems are due to bad incentives, so the only true changes can come from changes to incentives. However those incentives are set by the government. Individual scientists cannot themselves change the incentives. The granting agencies are entirely oblivious to the problems and the scale of their ambition is in no way equal to the scale of their problem:
"If you look at the NSF's 2019 Performance Highlights, you'll find items such as "Foster a culture of inclusion through change management efforts" (Status: "Achieved") and "Inform applicants whether their proposals have been declined or recommended for funding in a timely manner" (Status: "Not Achieved") .... We're talking about an organization with an 8 billion dollar budget that is responsible for a huge part of social science funding, and they can't manage to inform people that their grant was declined! These are the people we must depend on to fix everything."
Firstly, the problem here is not an epidemic of scientists who feel too financially insecure to do good work. Many of the worst papers are being written by people with decades-long careers and who lead large labs. Their funding is very secure. They are doing bad work anyway for other reasons, sometimes political or ideological, more often because doing bad work results in attention, praise and power. Or sometimes because they don't know how to explain their chosen question, but don't want to admit that scientifically they failed and don't know where to go next.
Secondly, as you already realized your proposal relies on identifying which scientists have a proven track record, but the whole problem is that science is flooded with fraudulent/garbage claims which are highly cited ("proven") and which were written by large teams of supposedly respectable scientists at supposedly respectable institutions. Any metric you can invent to decide who or what has a proven track record is going to be circular in this regard. To Rumsfeld the problem, we are surrounded by "unknown knowns". You say this is an open question but to me that's a fatal flaw.
So the problem is actually the inverse. You say at the end, well, scientists who can fund their own work are an exception. Obviously in most cases scientists don't need to do this, they can also be funded by companies. Most computer science research works this way. Better CPUs and hardware are developed almost entirely by companies. AI research has been driven by corporate scientists, and so on. In contrast academic funding comes primarily from government agencies that distribute money according to the desires of academics. This means a tiny number of people control large sums of money, and they are accountable to nobody except themselves. There are no systems or controls on academic behavior except peer review, which is largely useless because the peers are doing the same bad things as everyone else.
Viewed from an economic perspective academia is a planned reputation economy. The state is the source of all resource allocation decisions (academics being effectively state employees in most fields). There's also a deeply embedded Marxist worldview: universities have no working mechanisms to detect fraud, because of an implicit assumption that deep down when market forces are gone everyone is automatically honest and good. The hierarchy is stagnant; the same institutions remain at the top for centuries. A good reputation lets them select the people with the reputation for being smart (e.g. by school grade), so that reputation accrues to the institutions, which lets them keep selecting intake by reputation and so on. Supposedly Oxford and Cambridge are the best UK universities, they always have been, and they always will be. In a competitive, free market economy they would face competition and other institutions would seek to figure out what their secret is and copy it, like how so many companies try to copy the Toyota Way. In science this doesn't happen because there's nothing to copy: these institutions aren't actually different.
This implies a simple solution, just privatize it all. It would be wrenching, just like it was when the USSR transitioned to a market economy, just like it was when China (sort of) did the same. But one thing the 20th century teaches us is that you can't really fix the problems of a planned economy by tinkering with small reforms at the edges. The Soviets weren't able to fix their culture with glasnost and perestroika. They eventually had to give up on the whole thing. Replacing the current reputation economy with a real economy, with all the mechanisms that economic system has evolved (markets, prices, regulators, court cases, fraud laws etc), seems like a more direct and obvious approach to making things better, even if it may sound extreme.
My envisioned solution is similar to yours, here. But rather than "privatize science", which I think most people will interpret as "move to industrial research", my rallying cry is a little more like "hey scientists, stop depending on public funding, let's find creative ways to get the science done."
I also like to point out that money is often not the missing factor as much as community. This has always been true. Mendel discovered genetics by experimenting on pea plants in the garden of his monastery. It cost him very little to do it, and he only stopped the research when his community told him to stop wasting time on peas and get back to the important accounting work that impacted the church's politics at the time.
You might think that maybe science was cheap in the past, but that today you need lots of money, to get the lab equipment, etc. However, science always has a cutting edge of cheaply evaluable questions. We recently hosted a DIY Synthetic Biologist (currently on the homepage of https://invisible.college) who showed the actual costs of his work, and his laboratory equipment was far, far, cheaper than the "cost" of his time. We can get far more science done with "amateur scientists" (remember that "ama" means love, and an amateur scientist is one doing science for love) by creating a scientific community outside the institutions for interested parties to work together, pool their brainpower and resources, and come up with great novel work.
And if anyone else agrees with me on this, please let me know so we can join forces. I'm firstname.lastname@example.org, and am doing work on invisible.college.
I absolutely agree that a lot of science can be done very cheaply. Some of the most impactful papers were done by people who weren't in an institutional framework, even in the modern era (Satoshi being an obvious example). Additionally it seems most of the really problematic fields are ones where the budget gets dispersed over a large number of people writing very cheap low budget papers, hence millions of social science papers with tiny sample sizes.
I'm a big supporter of industrial research though. Many great papers come out of industrial labs. Modern computing is practically defined by such research. The big advances all seem to come from big corporate labs (Xerox PARC, Bell Labs, Google, DeepMind, IBM, Sun, Microsoft, etc). The research is powerful because it's funded by people who expect some sort of meaningful results and supervise the work to ensure it doesn't go completely off the rails. Academic institutions have developed this totally hands off attitude that makes research more or less unaccountable to any standard beyond "will it get published", which in turn can be rephrased as "are the claims interesting".
> The big advances all seem to come from big corporate labs
That's an interesting claim, and I'd encourage you to find some statistics to verify this hypothesis, because in my experience, that doesn't ring true.
From my subjective perspective, it seems that academic and industrial research labs innovate at roughly the same rate per-capita. I was a PhD student when Microsoft was dominant, hiring the best faculty from all top-4 CS schools (CMU, Berkeley, MIT, Stanford), and they certainly produced a lot of papers, and did seem to dominate conferences, but the actual innovation in computing came from Apple and startups, which did not have "research labs". Microsoft, including its giant industrial research lab, certainly was not the driver of innovation in computing!
And here are some numbers to back that up: Microsoft's R&D budget in 2011 was 10x the budget of the entire NSF -- for all sciences. Yet, Microsoft was clearly not producing more than 10x the scientific output of all NSF-funded academic science.
So it would help to have some statistics for the claim that industrial research innovates more than academic research. They certainly pay more, and often hire more people, but per-capita they don't seem any more productive or healthier than academics.
Apple does very little research, in the conventional scientific sense we're discussing here, I think that's pretty uncontroversial. They produce few if any papers. They are (or were, under Jobs) very good at coming up with new ideas that strongly appeal to the buyer and which got them a reputation for innovation, but which probably wouldn't be considered clever enough to be research papers. At least not top tier papers.
For example, exposé is a widely imitated feature and was considered very innovative at the time, but it wouldn't be seen as serious computer science. The iPhone is/was widely considered innovative but had basically no new research tech in it, given that capacitive touch screens weren't developed by Apple. It was just a really nicely implemented mobile computer. Actually the innovations in the iPhone are nearly all packagings of tech developed by third party firms that Apple then buys or buys exclusivity rights to. At least, that's true in my view.
Microsoft's R&D budget I think is also a victim of definitions. Software firms normally report all product development as R&D, right? I think these days they may even report datacenter builds as R&D. We can see this on Microsoft's investor website:
"In addition to our main research and development operations, we also operate Microsoft Research. Microsoft Research is one of the world's largest computer science research organizations"
i.e. the kind of university type "scientific" research we're discussing here is only a sideshow in Microsoft's R&D budget.
You're right to call me out though; I don't have any stats to prove that industrial research does more than academic research. It's not a statistical argument to begin with, just my own perception ("all seem to"). I read a lot of CS papers and the best ones have corporate email addresses at the top; the second best, a mix of corporate and university addresses; the third best, only university addresses. If you asked the man on the street to name the biggest innovations in computing in the past 20 years they'd probably say things like, uh, smartphones, YouTube, AI, blockchain, etc. All things that have little connection to universities, with AI being the closest, but it was Google that revived that whole field and has been pushing it forward ever since. Neural nets weren't receiving much investment by the academic community before that.
Anyway, that's CS. CS really isn't the problem here. The pseudo-science is elsewhere.
In most cases, peer reviewers will just assume that authors claiming the "code is available" means that a) it is reproducible and b) it is actually there.
As a counterexample, this recent splashy paper claims the code is available on GitHub, but the GitHub version ( https://github.com/jameswweis/delphi ) contains the actual model only as a Pickle file, and contains no data or featurization.
So clearly, the peer reviewers didn't look at it.
I think it's more important for reviewers to read the source, the same way one would read an experimental protocol and supplementary information, mainly checking for discrepancies between what the paper claims is happening and what is actually being done. In the above example, a reviewer reading the code would have spotted that the model isn't there at all, even though it runs fine.
In fact, I'd actually go further and question what kinds of errors could possibly be caught by running the same software that the authors did? Any accidental bugs will remain, and any malicious tampering with the experiment data is exceedingly unlikely to be caught even with a careful audit of the code.
When I was in graduate school, papers from one lab at Harvard were known to be “best case scenario”. Other labs had a rock solid reputation - if they said you could do X with their procedure, you could bet on it.
So basically we treated every claim as potential BS unless it came from a reputable lab or we or others had replicated it.
There are several levels of peer review. I've definitely been a reviewer on papers where the reviewers requested everything required and reproduced the experiment. That's extremely rare.
Some wider questions would be: Are there similar problems in mathematics and physics versus the life sciences and social sciences? Are there the same kind of problems across different fields of study?
Also, I wonder if replication issues would be less severe if there were a requirement to publish the software and raw data that any study is based on as open source / open data. It is possible that a change in this direction would make it more difficult to manipulate the results (after all, it's the public who paid for the research, in most cases).
The only way to fix replication issues is to give financial and career incentives for doing replication work. Right now there are few carrots and many sticks.