
Researcher questions years of his own work with a reexamination of fMRI data - EndXA
https://today.duke.edu/2020/06/studies-brain-activity-aren%E2%80%99t-useful-scientists-thought
======
jonnycomputer
An implicit assumption of commentary here is that no one has bothered to test
how reliable task and resting state fMRI results are. But that simply is not
the case:

[https://scholar.google.com/scholar?hl=en&as_sdt=0,47&q=%22te...](https://scholar.google.com/scholar?hl=en&as_sdt=0,47&q=%22test+retest+reliability%22++fmri&scisbd=1)

Results range from poor to excellent, depending on task, sample, and methodology.

Speaking as someone who has conducted longitudinal fMRI analyses, I would say
two things:

1) it's been fairly well established that within-individual correlations of
BOLD are generally higher than between-individual correlations

2) within-individual correlation can be weak to moderate, but group-level
patterns are very often quite robust. For example, in a cognitive control task
you are going to see pre-supplementary motor area, dorsal anterior cingulate,
anterior insula, and so on. For risk tasks, you are going to find that
anterior insula correlates with risk; for learning tasks, you are going to
find striatum signals for reward prediction errors.

I encourage people to visit Neurosynth, which performs automated meta-analyses.
Here, for example, are results for "prediction error" after accounting for
activation that occurs generally during tasks:

[https://neurosynth.org/analyses/terms/prediction%20error/](https://neurosynth.org/analyses/terms/prediction%20error/)

and here for "interference"

[https://neurosynth.org/analyses/terms/interference/](https://neurosynth.org/analyses/terms/interference/)

(default mode network and task-positive networks anti-correlate in activity,
another super-reliable result)

~~~
jonnycomputer
I do not want to leave the impression that there are no issues with fMRI
analysis. It's hard to do right.

[https://www.nature.com/articles/s41586-020-2314-9.pdf](https://www.nature.com/articles/s41586-020-2314-9.pdf)

------
Bukhmanizer
This is pretty close to my background. The article is fairly dramatically
written. It’s not as if nobody has been looking at reliability and
discriminability until now. It’s a big topic in the field with both task and
rest fMRI.

~~~
conjectures
Mine too and I agree.

There is a big spread in the quality of work in the field. From thoughtful
analyses of large datasets, to pretty bad examples of p-hackery with push-
button software.

A root cause of problems is people asking scientific questions of fMRI data
whose answers would lie at a much finer spatiotemporal resolution than the
medium can support.

~~~
phkahler
> A root cause of problems is people asking scientific questions of fMRI data
> whose answers would lie at a much finer spatiotemporal resolution than the
> medium can support.

Reminds me of the "mirror neurons" nonsense from a while back.

~~~
dr_dshiv
What was nonsensical about mirror neurons, if I might ask?

------
EndXA
The original study:
[https://journals.sagepub.com/doi/abs/10.1177/095679762091678...](https://journals.sagepub.com/doi/abs/10.1177/0956797620916786)

Abstract:

> Identifying brain biomarkers of disease risk is a growing priority in
> neuroscience. The ability to identify meaningful biomarkers is limited by
> measurement reliability; unreliable measures are unsuitable for predicting
> clinical outcomes. Measuring brain activity using task functional MRI (fMRI)
> is a major focus of biomarker development; however, the reliability of task
> fMRI has not been systematically evaluated. We present converging evidence
> demonstrating poor reliability of task-fMRI measures. First, a meta-analysis
> of 90 experiments (N = 1,008) revealed poor overall reliability—mean
> intraclass correlation coefficient (ICC) = .397. Second, the test-retest
> reliabilities of activity in a priori regions of interest across 11 common
> fMRI tasks collected by the Human Connectome Project (N = 45) and the
> Dunedin Study (N = 20) were poor (ICCs = .067–.485). Collectively, these
> findings demonstrate that common task-fMRI measures are not currently
> suitable for brain biomarker discovery or for individual-differences
> research. We review how this state of affairs came to be and highlight
> avenues for improving task-fMRI reliability.

------
brainmapper
As an fMRI practitioner, I just want to point out that this problem has
nothing to do with fMRI. It has to do with the weaknesses inherent in the most
common methods for designing fMRI experiments, and for analyzing and modeling
the data. The SNR in fMRI depends heavily on the level of blood pressure,
arousal, attention and other factors that vary day-to-day. Any analysis that
doesn't account for that variability will be unreliable.

Methods for designing experiments, analyzing and modeling fMRI data that are
robust to this variability are available. The problem is that most people in
the field don't use them.

~~~
trombonechamp
Which methods are you referring to?

~~~
brainmapper
To quantify the value of a method it is useful to consider the amount of
information that the method recovers from the data stream, the prediction
accuracy of the resulting models, generalization ability outside of the
conditions used to fit the model and decoding/reconstruction accuracy. For all
four of those criteria, the best approach is to use linearized encoding models
that estimate an FIR filter for each voxel separately.
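
For readers unfamiliar with the idea, here is a minimal sketch of what "estimate an FIR filter for each voxel" can look like for a single simulated voxel. Everything in it is an assumption made for illustration: the stimulus is a single binary indicator (real encoding models use much richer feature spaces), the "true" response kernel and noise level are made up, and the fit is plain least squares via the normal equations. Practical pipelines commonly add regularization; this is not a description of any particular lab's method.

```python
import random

def fir_design(stimulus, n_lags):
    """Rows are time points; column j holds the stimulus delayed by j samples."""
    return [[stimulus[t - j] if t - j >= 0 else 0.0 for j in range(n_lags)]
            for t in range(len(stimulus))]

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def fit_fir(stimulus, signal, n_lags):
    """Least-squares FIR weights: one free weight per lag, no assumed response shape."""
    x = fir_design(stimulus, n_lags)
    xtx = [[sum(row[i] * row[j] for row in x) for j in range(n_lags)]
           for i in range(n_lags)]
    xty = [sum(row[i] * y for row, y in zip(x, signal)) for i in range(n_lags)]
    return solve(xtx, xty)

# Simulated voxel (hypothetical kernel and noise level).
rng = random.Random(0)
true_kernel = [0.0, 0.4, 1.0, 0.7, 0.3, 0.0]
stimulus = [1.0 if rng.random() < 0.2 else 0.0 for _ in range(600)]
signal = [
    sum(true_kernel[j] * stimulus[t - j]
        for j in range(len(true_kernel)) if t - j >= 0)
    + rng.gauss(0.0, 0.5)
    for t in range(len(stimulus))
]

estimated = fit_fir(stimulus, signal, n_lags=len(true_kernel))
print([round(w, 2) for w in estimated])
```

The point of the FIR form is that the shape of the response is estimated from the data for each voxel rather than fixed in advance.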

------
LargoLasskhyfv
On the other hand you have the reverse engineering of what people see.

[1] [https://news.berkeley.edu/2011/09/22/brain-movies/](https://news.berkeley.edu/2011/09/22/brain-movies/)

[2] [https://nuscimag.com/your-brain-on-youtube-fmris-reverse-eng...](https://nuscimag.com/your-brain-on-youtube-fmris-reverse-engineer-the-visual-experience-2650eb055e8f)

[3]
[https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4941940/](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4941940/)

[4] [http://longnow.org/seminars/02018/oct/29/toward-practical-te...](http://longnow.org/seminars/02018/oct/29/toward-practical-telepathy/)

While this reminds me a little bit of all the back and forth in nutrition, I
simply enjoy the ride and think of

[5]
[https://en.wikipedia.org/wiki/Brainstorm_(1983_film)](https://en.wikipedia.org/wiki/Brainstorm_\(1983_film\))

:)

~~~
brainmapper
The movie reconstruction from brain activity work was from my lab, thanks for
the shout out!

The problem with many methods of analyzing and modeling functional MRI data is
that the signal-to-noise varies hugely across time, across individuals and
across brain regions within an individual. Unfortunately, the most common
methods of analyzing and modeling fMRI data do not incorporate any principled
method for dealing with this SNR variability.

What makes this frustrating from a practitioner's point of view is that we do
have methods for analyzing and modeling these data that can account for these
uncontrolled SNR changes. The problem is that most people don't use these
methods. (My lab has pioneered many of these techniques, and that is why we
could produce those compelling decoding results.)

------
hliyan
Does anyone know the mechanism that increases blood flow to more active parts
of the brain? Not being an expert, I'm imagining expanding/constricting blood
vessels controlled by a signalling pathway from... what?

~~~
nvrspyx
The following is my understanding, but I could be wrong:

When neurons activate, they release a neurotransmitter (typically glutamate)
that then binds to nearby astrocytes. Astrocytes then increase their
intracellular calcium levels which in turn causes vascular dilation of the
nearby area, thus increasing blood flow to that area.

~~~
hliyan
Thanks, this was what I was looking for. So I guess this pathway is not very
predictable, hence the variations in subsequent measurements of the same
subject.

~~~
jrumbut
Also consider aligning the images to process them all with a single model, the
different sized/shaped heads in the world and correcting for their different
angles and little bits of movement. That's not even considering differences
between brains.

I am continually amazed at the level of automated preprocessing tools to deal
with this alignment problem. You can play around with fMRI data using some
popular packages without thinking too hard about it and produce visualizations
almost as nice looking as the ones in the article.

~~~
brainmapper
The most sensitive fMRI methods don't do alignment at all, they do all data
processing in the individual subject's brain space.

------
pbhjpbhj
Presumably activity from background tasks, subconscious processes, will be a
changing overlay to whatever current conscious tasks are happening. And, again
presumably, other overlays like mood exist?

Poor repeatability seems inherent, the starting brain state will be different,
the background processes will be different, the accompanying thoughts will be
different?

Is this reanalysis looking at something else beyond this: Like, are people
using no brain areas in common across some repeated tasks??

~~~
wintermutesGhst
Well, the issue is that if you accept that brain state will always be
different then there isn't much predictive power in the measure.

The conceit was always that you could measure it across a bunch of people and
find the commonly active areas across enough datasets. Even with different
baselines, the areas critical to the task would elevate above that baseline.

This paper finds that even in a single person, activity (above the baseline)
is poorly correlated across recording sessions. They use a technique called
intraclass correlation to measure this:
[https://en.wikipedia.org/wiki/Intraclass_correlation](https://en.wikipedia.org/wiki/Intraclass_correlation)
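
To make the measure concrete, here is a rough sketch of a test-retest ICC on simulated data. It assumes the simplest one-way random-effects form, ICC(1,1), which may not be the exact variant the paper reports, and the trait and noise standard deviations are invented. Each simulated subject has a stable "trait" activation plus independent session noise on each of two scans.

```python
import random
from statistics import mean

def icc_oneway(scores):
    """One-way random-effects ICC(1,1).

    scores: list of per-subject lists, one value per session (equal lengths).
    """
    n = len(scores)
    k = len(scores[0])
    grand = mean(v for row in scores for v in row)
    subj_means = [mean(row) for row in scores]
    # Between-subject and within-subject mean squares from a one-way ANOVA.
    msb = k * sum((m - grand) ** 2 for m in subj_means) / (n - 1)
    msw = sum((v - m) ** 2
              for row, m in zip(scores, subj_means)
              for v in row) / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)

def simulate(n_subjects, trait_sd, noise_sd, seed=0):
    """Two sessions per subject: stable trait plus independent session noise."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_subjects):
        trait = rng.gauss(0.0, trait_sd)
        data.append([trait + rng.gauss(0.0, noise_sd) for _ in range(2)])
    return data

# Hypothetical settings: same trait spread, different amounts of session noise.
noisy = icc_oneway(simulate(2000, trait_sd=1.0, noise_sd=1.5))
clean = icc_oneway(simulate(2000, trait_sd=1.0, noise_sd=0.3))
print(f"noisy sessions: ICC ~ {noisy:.2f}; clean sessions: ICC ~ {clean:.2f}")
```

In this setup the ICC approximates trait variance divided by total variance, so session-to-session noise that is large relative to real between-person differences drives it down even when every subject has a genuine, stable effect.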

~~~
SubiculumCode
But it does not contest whether the group means for the sample are reliable,
which has been demonstrated repeatedly.
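
A toy simulation of how both things can be true at once, with all numbers made up for illustration: every simulated subject shares the same group-level activation, individual differences are small, and session noise is large. The group mean then replicates across sessions while the per-subject test-retest correlation stays low.

```python
import random
from statistics import mean, stdev

def pearson(x, y):
    """Plain Pearson correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

rng = random.Random(1)
n = 500
group_effect = 1.0                                  # assumed shared activation
traits = [rng.gauss(0.0, 0.3) for _ in range(n)]    # small individual differences

# Two sessions per subject with large, independent session noise.
s1 = [group_effect + t + rng.gauss(0.0, 1.0) for t in traits]
s2 = [group_effect + t + rng.gauss(0.0, 1.0) for t in traits]

retest_r = pearson(s1, s2)                  # individual-level reliability
t1 = mean(s1) / (stdev(s1) / n ** 0.5)      # one-sample t statistic, session 1
t2 = mean(s2) / (stdev(s2) / n ** 0.5)      # one-sample t statistic, session 2

print(f"test-retest r = {retest_r:.2f}")
print(f"group mean: session 1 = {mean(s1):.2f}, session 2 = {mean(s2):.2f}")
print(f"group t: session 1 = {t1:.1f}, session 2 = {t2:.1f}")
```

Averaging over subjects cancels the session noise, which is why the group map is stable; nothing cancels it for a single person's score.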

------
rbraffin
I spent 3 years doing research on (f)MRI. This problem is deeply linked with
the models used for the analysis. Basically, the BOLD signal is a new
exponential fit every time. There are no subject parameters considered. On the
other hand, task-based analyses always consider the stimulus a discrete step
function for the deconvolution.

For example, whenever we enter a room with an annoying clock tick-tacking, we
observe it doesn't affect the person native to the room as much. Or whenever
we watch an ad a second time, it goes by more quickly. Such cases are not
accounted for in current models.

------
sul_tasto
My brother-in-law suffered a severe heart attack about a month ago. He’s 40.
His wife told us he showed no brain activity, other than that in his upper
spinal cord, in the MRIs done at 1 and 2 weeks after the event. He woke up a
day after his second MRI and is currently in rehab. I never saw the scan
results, but I don’t understand how they could be so wrong.

~~~
neuronexmachina
I'm under the impression that fMRI for predicting coma outcomes is still quite
experimental: [https://www.statnews.com/2015/11/11/brain-scans-coma-recover...](https://www.statnews.com/2015/11/11/brain-scans-coma-recovery/)

------
tlb
The article claims that images of the same people done months apart have poor
correlation. It doesn't say how closely they're correlated on the same day,
but presumably a poor correlation there would have been noticed before. If
there's some sort of activity shift on month time scales, that would be
extremely interesting.

------
projektfu
I have felt that there’s circular reasoning going on with fMRI results, where
the parts of the brain light up for something, showing that the process is in
that part of the brain, and then they light up again with something else,
which shows that thing is the same process.

~~~
jonnycomputer
A brain region can calculate more than one function, so that part isn't
problematic. But interpretation of the BOLD signal is inherently tricky because
it's like measuring heat off of a processor: it might tell you how hard it's
working, but not what it's working on. But if you can modulate the temporal
dynamics of the processing, then maybe you have something to work with.

One approach is to have a generative normative model of a mechanism (e.g.
temporal-difference learning) verified by lower-level research (unit
recordings in animal models), fit parameters to the model based on the task
behavior (e.g. learning rate) and then find the correlates to that (e.g. to
the subjective reward prediction errors as they occur in the task at time of
decision feedback). The benefit here is you have already a plausible mechanism
that can recover the behavior, and you are finding changes in BOLD signal that
track those. Doesn't solve the problem entirely, but it's better than just
correlation with whatever.
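
A stripped-down sketch of that workflow, under several simplifying assumptions: the task is a hypothetical two-armed bandit with reversals, the learner is a Rescorla-Wagner delta rule (which is what temporal-difference learning reduces to when each trial is a single step), the softmax inverse temperature `beta` is treated as known, and the learning rate is fit by a coarse grid search. The resulting trial-by-trial prediction errors are what you would then use as a parametric regressor at feedback time.

```python
import math
import random

def rw_prediction_errors(choices, rewards, alpha, n_arms=2):
    """Trial-by-trial reward prediction errors of a Rescorla-Wagner learner."""
    q = [0.0] * n_arms
    errors = []
    for c, r in zip(choices, rewards):
        delta = r - q[c]            # prediction error at feedback
        q[c] += alpha * delta       # value update for the chosen arm
        errors.append(delta)
    return errors

def log_likelihood(choices, rewards, alpha, beta, n_arms=2):
    """Log-likelihood of the observed choices under softmax action selection."""
    q = [0.0] * n_arms
    total = 0.0
    for c, r in zip(choices, rewards):
        z = [beta * v for v in q]
        zmax = max(z)
        log_norm = zmax + math.log(sum(math.exp(v - zmax) for v in z))
        total += z[c] - log_norm
        q[c] += alpha * (r - q[c])
    return total

def simulate_agent(n_trials, alpha, beta, seed=0):
    """Simulated behavior standing in for a real participant (hypothetical task)."""
    rng = random.Random(seed)
    q = [0.0, 0.0]
    p_reward = [0.8, 0.2]
    choices, rewards = [], []
    for t in range(n_trials):
        if t > 0 and t % 50 == 0:
            p_reward.reverse()      # reversals keep learning ongoing
        p_choose_1 = 1.0 / (1.0 + math.exp(-beta * (q[1] - q[0])))
        c = 1 if rng.random() < p_choose_1 else 0
        r = 1.0 if rng.random() < p_reward[c] else 0.0
        q[c] += alpha * (r - q[c])
        choices.append(c)
        rewards.append(r)
    return choices, rewards

choices, rewards = simulate_agent(400, alpha=0.3, beta=5.0)

# Step 1: fit the learning rate to behavior (beta assumed known here).
grid = [i / 20 for i in range(1, 20)]
fitted_alpha = max(grid, key=lambda a: log_likelihood(choices, rewards, a, beta=5.0))

# Step 2: derive the model-based regressor to correlate with BOLD at feedback.
regressor = rw_prediction_errors(choices, rewards, fitted_alpha)
print(f"fitted alpha = {fitted_alpha:.2f}, first errors = {regressor[:3]}")
```

In a real analysis the regressor would still need to be aligned to feedback onsets and passed through a hemodynamic model before being related to the BOLD signal; that step is left out here.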

------
seemslegit
Didn't that already happen with the "brain activity in a dead salmon" affair?

------
v77
There's a Wired article from almost fifteen years ago taking the piss out of
fMRI.

------
neurostudent
This should be less of a problem as fMRI resolution continues to increase.

~~~
JackFr
Why? It doesn't seem to be a problem of resolution.

------
SCAQTony
Is the mesh in the photo the atom network shaping the protein?

------
dzdt
Not mentioned: the dead salmon study of 2009 that outed a bunch of fMRI
studies as using incorrect statistics, and which went on to win an Ig Nobel
Prize.

[1] [https://blogs.scientificamerican.com/scicurious-brain/ignobe...](https://blogs.scientificamerican.com/scicurious-brain/ignobel-prize-in-neuroscience-the-dead-salmon-study/)

~~~
SubiculumCode
It is not mentioned because it is not relevant. The dead salmon experiment
highlighted that some researchers were failing to correct for multiple
comparisons sufficiently. This article is talking about the test-retest
reliability of the fMRI measure in individuals.

edit: Instead of down voting, perhaps moderators might write how that
experiment is actually relevant.
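
For anyone who wants to see the multiple-comparisons issue in isolation, here is a small null simulation. The voxel count, scan count, and the uncorrected threshold of 0.001 are assumptions picked for illustration, and a z-test with known variance stands in for the usual t-test. Every "voxel" is pure noise, so any voxel that crosses the threshold is a false positive.

```python
import random
from statistics import NormalDist

rng = random.Random(42)
n_voxels, n_scans = 20000, 20
alpha = 0.001                      # assumed uncorrected per-voxel threshold

def null_z(rng, n_scans):
    """z statistic for the mean of n_scans pure-noise samples (unit variance)."""
    total = sum(rng.gauss(0.0, 1.0) for _ in range(n_scans))
    return total / n_scans ** 0.5

z_scores = [null_z(rng, n_scans) for _ in range(n_voxels)]

# One-sided critical values: per-voxel alpha vs. Bonferroni at family-wise 0.05.
z_uncorrected = NormalDist().inv_cdf(1 - alpha)
z_bonferroni = NormalDist().inv_cdf(1 - 0.05 / n_voxels)

uncorrected_hits = sum(z > z_uncorrected for z in z_scores)
bonferroni_hits = sum(z > z_bonferroni for z in z_scores)

print(f"expected false positives without correction: {alpha * n_voxels:.0f}")
print(f"observed: uncorrected = {uncorrected_hits}, Bonferroni = {bonferroni_hits}")
```

This is a separate question from test-retest reliability: correcting the threshold controls false positives within one scan session and says nothing about whether an individual's activation is stable across sessions.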

~~~
seesawtron
I agree: the dead salmon study outlines the flaws of statistics. This study
outlines the flaws/limitations of poorly formed experimental paradigms to
which fMRI is not immune. It's like comparing apples to oranges.

------
dirtydroog
How gutting for the researcher. It's very admirable for him to be so open
about it all

~~~
disgruntledphd2
To be fair, if this paper holds up then he's going to be the center of a
scientific controversy which will increase citations, or it's going to become
a required cite in pretty much every fMRI paper, which will increase
citations.

So from a career perspective, this is actually pretty good (and wonderful
work, even if it wasn't going to be good for his career).

------
basicplus2
How deliciously wonderful.. we humans can still have some mystery about us.. I
find this to be great news.

~~~
w_t_payne
We still have a whole heap of mystery about us - especially where the brain is
involved. :-)

------
op03
Good. It's been as dumb as pointing a thermal camera at a CPU and memory chips
and getting excited about hotspots.

~~~
redis_mlc
One of the better HN analogies. Well-played!

------
tompccs
"Phrenology (from Ancient Greek φρήν (phrēn), meaning 'mind', and λόγος
(logos), meaning 'knowledge') is a pseudoscience which involves the
measurement of bumps on the skull to predict mental traits."

[https://en.wikipedia.org/wiki/Phrenology](https://en.wikipedia.org/wiki/Phrenology)

