
FastMRI initiative releases neuroimaging data set - moneil971
https://ai.facebook.com/blog/fastmri-releases-neuroimaging-data-set/
======
axegon_
> Apply for Access

> The application process includes acceptance of the Data Sharing Agreement
> (found below) and submission of an online application form. The application
> must include the investigator’s institutional affiliation and the proposed
> uses of the data. NYU fastMRI data may be used for internal research or
> educational purposes only as described in the data use agreement and may not
> be redistributed in any way without prior permission.

> Read and agree to the data use agreement below to apply for access.

Seriously?!?!?!?!?! You call that open source?!??!?! OK, let's leave the
semantics aside for a moment. Yes, a dataset like that would be very
interesting and I would happily play around with it for a few weeks and see
what I can come up with. If it's something worth exploring further, I'll
happily document it and open source it. But that isn't something I can really
estimate without being able to explore the data and see first hand what it is.
For that reason alone, I (and dozens of others off the top of my head) will
roll our eyes and pretend like it doesn't exist.

False claims, over-hyping with no real understanding and bureaucratic crap
like this are what is slowing everyone and everything down. The sooner people
understand it, the better.

~~~
99052882514569
>Yes, a dataset like that would be very interesting and I would happily play
around with it for a few weeks and see what I can come up with.

Spare-time tinkerers aren't really the intended audience here:

>>The neuro data set will allow researchers to test their models with data
from additional machine types, new sequence types, and different coil
configurations that were not present in the previously released fastMRI knee
data set. Radiologists also look for different diagnostic properties (such as
contrast in texture between different neural tissue) in brain MRIs. These
differences present an interesting and challenging machine learning problem to
solve and will help researchers develop models that generalize to more
clinical settings.

It's unlikely in the extreme that anyone in that audience will be stopped by a
simple data sharing agreement. And it's also unlikely in the extreme that
anyone outside that audience will know what to do with a bunch of raw k-space
MR datasets. Domain knowledge is an absolute necessity with this data.

~~~
axegon_
Domain knowledge, while useful, is not a necessity - I'm fine with contacting a
specific domain expert for help and even paying for his or her services (with
my own money at that), though none of the ones I've ever contacted ever wanted
anything for the time they spent on a problem I presented them with. To put it
as a question: do I really need to explain how open source works in theory and
practice on a place like hackernews? Especially given that most of the world's
infrastructure in practice runs thanks to the collaboration of millions who
have built most of what we use in their "spare time" as you call it.

Same applies to DNNs (if not more so) - take any large DNN with the papers,
data set and code to the author and ask them to give an explanation as to why
it works as well as it does while it performs terribly on a different data
set, even a similar one. "Well yeah, it's curve fitting which works here but
doesn't work there". Why? ¯\_(ツ)_/¯

My rant concerns a different problem - if you have such data and you want to
share it, just go ahead and do so. 1 in every 10000 might do something useful
with it but we aren't talking about nuclear experiments where something can
blow up, are we? Worst case scenario someone's cpu or gpu might overheat, big
deal. Just ditch the entire bureaucracy crap, we have enough of that as it is
in our daily lives.

~~~
SubiculumCode
This is human subject data and there are good reasons to set legal limits on
how that data is used, even if anonymized. Your "rant" is not well informed.

------
dontreact
I will reiterate my comment on these types of projects. The regulatory pathway
established by the fda for these types of products is woefully inadequate and
they are very very hard to properly validate.

I think any application of deep convolutional neural networks should be
alongside a radiologist. If we speed up scans and make up for it with convnets
it is very hard (practically speaking: impossible) to properly validate that
they will not hallucinate away rare abnormalities. It will also be impossible
for radiologists to spot errors like this in the wild because of the
reduction in quality of the scan.

What happens when the scanners change their behavior in some subtle way that
is unaccounted for by FastMRI? It could start erasing a ton of subtle
abnormalities and this would not be possible to check for since the original
scan will be lower quality.

~~~
dmead
most places call this type of thing "clinical decision support". nobody in
their right mind wants to remove the human doctors from the process... yet.

~~~
dontreact
And yet, for the reasons I stated, this is exactly what FastMRI aims to do.
Speed up the scan. There will be no way for Radiologists to oversee the
reconstruction and make sure subtle abnormalities are preserved.

~~~
dangom
Ideally a proper DNN reconstruction would learn the mapping from the raw-space
to image-space. See, for example:
[https://www.nature.com/articles/nature25988](https://www.nature.com/articles/nature25988).

There is just too much redundancy in MRI data, and initiatives such as FastMRI
are fundamental for us to learn what the limits are of feasible acceleration.
Also, some MRI scans take forever and cannot be used in vulnerable populations
because of, e.g., breath holds, the need to stand still, etc. The image
quality, perhaps counter-intuitively, in some situations improves with
acceleration.

~~~
dontreact
Can you explain why mapping from raw space gets rid of any of the concerns I
raised?

It’s interesting research for sure. I hope it stays far away from actual
clinical use for a while, for the reasons I highlighted. I’d like to see
convnets work alongside radiologists for a while and prove robustness to
scanner changes in the wild before we start shoving them deep in the stack
where radiologists can’t review what’s happening.

------
ChrisFoster
Current MR physicist / data scientist here. There seems to be a lot of
misapprehension in this thread.

First, this work is about taking data in the sensor domain ("k-space") and
reconstructing it into an image. Doing this with partial k-space data and
hand-coded heuristics is a _completely standard_ part of the MRI research
agenda and has been for quite some time. See, for example,
[http://mriquestions.com/k-space-trajectories.html](http://mriquestions.com/k-space-trajectories.html).
Further, several of these techniques have already made it into routine
clinical work, and this acquisition-side stuff generally happens before the
radiologist even sees the image (reliable acquisition is in the interaction of
radiographer with the scanner manufacturer's software).
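
To make the k-space-to-image step concrete, here is a minimal 1D sketch of the
simplest baseline, zero-filled reconstruction: missing k-space samples are set
to zero before the inverse transform. It is an illustration only, not any of
the techniques referred to above. The toy `profile`, the `cutoff` value, and
all function names are assumptions, and a real pipeline would use an FFT
library on 2D or 3D multi-coil data instead of this naive DFT.

```python
import cmath
import math


def dft(samples):
    """Naive O(n^2) forward DFT: image-domain samples -> k-space samples."""
    n = len(samples)
    return [
        sum(samples[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]


def idft(kspace):
    """Naive inverse DFT: k-space samples -> image-domain samples."""
    n = len(kspace)
    return [
        sum(kspace[k] * cmath.exp(2j * cmath.pi * j * k / n) for k in range(n)) / n
        for j in range(n)
    ]


def keep_low_frequencies(kspace, cutoff):
    """Zero every sample whose frequency index is more than `cutoff` bins from DC."""
    n = len(kspace)
    return [v if min(k, n - k) <= cutoff else 0j for k, v in enumerate(kspace)]


def rms_error(a, b):
    """Root-mean-square difference between two equal-length real sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


# Hypothetical 1D "anatomy": dark background with one bright block.
profile = [1.0 if 12 <= j < 20 else 0.0 for j in range(32)]

kspace = dft(profile)
full_recon = [v.real for v in idft(kspace)]
partial_recon = [v.real for v in idft(keep_low_frequencies(kspace, cutoff=4))]

full_error = rms_error(full_recon, profile)        # round-off only
partial_error = rms_error(partial_recon, profile)  # blurring and ringing at the edges
```

With all samples present the profile comes back up to round-off; with only the
central samples the block edges blur and ring, which is the gap that smarter
reconstruction (hand-coded or learned) tries to close.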

There are also various claims here that seem to imply learned reconstruction
inherently implies the risk of hallucinations without recourse. Naturally, one
should be careful about this, but it's just a matter of careful cross
validation: hold out examples of abnormal anatomy for the test set. There are
other ways to attack this problem too: training can be done partly or mostly
on synthetic data because we have reasonably good forward models of the
physics. In this case, one could choose a wide variety of arbitrary synthetic
anatomies during training, to further allay the fear of always hallucinating
the "typical human brain" from any scan.
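
The hold-out idea can be sketched as a split that guarantees abnormal cases
land in the test set. This is a rough sketch under assumptions: the
`(case_id, is_abnormal)` labels, the function name, and the 20% fraction are
all hypothetical, and a real study would also split by patient and by type of
abnormality.

```python
import random


def split_with_abnormal_holdout(cases, test_fraction=0.2, seed=0):
    """Split (case_id, is_abnormal) pairs into train and test lists.

    Abnormal and normal cases are shuffled and split separately, so the
    test set always contains abnormal anatomy if any exists.
    """
    rng = random.Random(seed)
    train, test = [], []
    for flag in (True, False):
        group = [case for case in cases if case[1] == flag]
        rng.shuffle(group)
        # Hold out at least one case from every non-empty group.
        n_test = max(1, round(len(group) * test_fraction)) if group else 0
        test.extend(group[:n_test])
        train.extend(group[n_test:])
    return train, test


# Hypothetical cohort: 50 cases, every tenth one labelled abnormal.
cases = [(case_id, case_id % 10 == 0) for case_id in range(50)]
train, test = split_with_abnormal_holdout(cases)
abnormal_in_test = [case for case in test if case[1]]
```

Reconstruction quality would then be reported separately on
`abnormal_in_test`, since an average over the whole test set can hide failures
on rare findings.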

Slow acquisition and image artifacts in MRI are a fact of life for people in
the field and I believe there's huge scope for improvement if we had more
intelligent reconstruction and acquisition. Ideally the reconstruction would
feed dynamically back into the acquisition to gather more context as
necessary; the MR machine is, after all, one giant programmable physics
experiment. This is already done in a limited way, but in what I've seen it
relies on a lot of hand-coded heuristics. And guess what's the logical step
after hand-coded heuristics? Yes, learned models where you objectively
optimize for a final result, rather than hand-coding based on a few examples.

Final note - publicly releasing human data is a massive effort in data
cleaning and careful anonymization. Not to mention that the acquisition of
each sample is extraordinarily expensive. So bravo to these guys for going to
the effort.

------
lvs
This is a typical misapplication of machine learning. It's important to
realize what information can possibly be learned by a trained network. In this
case, the only thing that can be learned is which components of Fourier space
can be ignored for a given imaging problem. Such a question is far more
rationally addressed by a deterministic algorithm, if the goal is to speed up
acquisition for any specific anatomy. But to generally assume that an imaging
protocol can ignore parts of frequency/phase space without generating
artifacts is not only wrong, but very dangerous for patients. I can easily
play with Fourier space and generate the appearance of pathological conditions
that don't exist -- or disappear ones that do! Not good!

~~~
nwah1
If it is so easy, as you say, then if you can release examples where you do
that using their training data, then you would probably get some acclaim, and
I personally would applaud your effort.

~~~
lvs
I'll spend time on this if I can, but for the moment just imagine that in
frequency space, where MRI acquisition occurs, all I need to do to blank out
an imaging anomaly is discard spatial frequencies that correspond to the
principal spatial frequencies of the anomaly. This is analogous to blurring
your speech by notch filtering the principal frequencies of your voice. With
appropriate filtering, I can make your voice nearly indistinct from background
noise, so that it appears not to be there.
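
The notch-filter argument can be sketched in 1D. This is a deliberately
idealized case, assuming the anomaly is a single pure spatial frequency; real
lesions are localized and spread over many frequencies, so they would not
vanish this cleanly. The signal shapes, the frequency index 9, and the function
names are all made up for illustration.

```python
import cmath
import math


def dft(samples):
    """Naive O(n^2) forward DFT: image-domain samples -> frequency-domain samples."""
    n = len(samples)
    return [
        sum(samples[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Naive inverse DFT: frequency-domain samples -> image-domain samples."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * cmath.pi * j * k / n) for k in range(n)) / n
        for j in range(n)
    ]


def notch(spectrum, freq):
    """Zero the bins at +freq and -freq, leaving every other bin untouched."""
    n = len(spectrum)
    out = list(spectrum)
    out[freq % n] = 0j
    out[-freq % n] = 0j
    return out


n = 64
# Hypothetical smooth background plus a small single-frequency "anomaly".
background = [1.0 + 0.5 * math.cos(2 * math.pi * 2 * j / n) for j in range(n)]
anomaly = [0.2 * math.cos(2 * math.pi * 9 * j / n) for j in range(n)]
scan = [b + a for b, a in zip(background, anomaly)]

filtered = [v.real for v in idft(notch(dft(scan), freq=9))]

# Largest deviation from the anomaly-free background, before and after filtering.
before = max(abs(s - b) for s, b in zip(scan, background))
after = max(abs(f - b) for f, b in zip(filtered, background))
```

Here `after` drops to round-off level while `before` is the full anomaly
amplitude: discarding two frequency bins makes the filtered signal
indistinguishable from the anomaly-free background.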

------
y-c-o-m-b
As someone with an undiagnosed neurological illness, it would be fun if I
could run my MRI backups through it.

~~~
abrichr
I've been thinking about this type of use case recently. I'm curious, how many
MRI scans do you have? How do you store and view them? Have you applied any
computer vision techniques to them?

Thanks!

~~~
y-c-o-m-b
Sorry for the delayed response. I have multiple MRIs of my brain and the
entirety of my spine, most of them at 3T. I got them stored as ISO copies of
the CD they made for me. I have no idea where to begin by applying computer
vision. If you can point me in that direction, I would be happy to try.

EDIT: As for viewing, they come with a proprietary viewer loaded on the CD. I
do have the ability to export them as JPG image stacks though.

~~~
abrichr
Thanks for the info! Regarding using computer vision, the question to ask is:
what do you want to know about the content of your scans? For example, if you
have multiple scans over time, do you want to see how the structures are
changing? Or if it's just one or two points in time, do you want to know the
names of the structures?

I've been working in medical imaging and deep learning for years, and have
recently become disillusioned by the technology's potential to disrupt the
radiology industry. But I wonder if there aren't alternative use cases for
e.g. educational purposes. I'd love to know more about what you wish you could
do or know! Please feel free to email me.

------
andbberger
Fantastic!! Accelerating MRI with ML is an idea I've had in my little idea
book for years and I'm delighted to see it getting some mainstream attention!

It's a serious technical challenge but the benefits could be enormous.

IIRC the vast majority of the cost of an MRI is the amortized cost of the
imager, so faster scans should hopefully directly reduce the cost to patients,
perhaps to the point that regular full-body MRI scans for preventative
healthcare could be feasible.

~~~
Waterluvian
> perhaps to the point that regular full-body MRI scans for preventative
> healthcare could be feasible.

This is an interesting problem. Speaking with a number of physicians among the
family, there's a perspective that having a population performing all these
tests can possibly cause more problems than they solve. If the goal is to
holistically make a person healthy and happy, discovering diagnoses that have
no practical effects on someone's well-being can result in reducing their
well-being just by knowing about it. Humans are notoriously bad this way. Tell
someone their liver is somewhat different from an average adult liver and
they'll start assigning symptoms to it.

~~~
andbberger
I don't know, that line of thought seems myopic to me.

I'm more thinking, wow you could really do a lot in the way of automated
diagnoses if you had longitudinal data sets like that.

~~~
Waterluvian
Yeah. Certainly not saying it's an open and shut case on the right way to
test. Obviously we do whole-population preventative medicine for many things.

------
markmiro
Would they be filling in missing data with AI? Would they be doing something
similar to DeepFovea? If so, I would be concerned about accuracy.

[https://ai.facebook.com/blog/deepfovea-using-deep-
learning-f...](https://ai.facebook.com/blog/deepfovea-using-deep-learning-for-
foveated-reconstruction-in-ar-vr/)

------
Copenjin
Is this accessible only to people affiliated with some research institution as
it seems[1]?

[1] [https://fastmri.med.nyu.edu/](https://fastmri.med.nyu.edu/)

~~~
NYFB
No, it's available for all researchers, and you don't need to have a certain
affiliation. Thanks!

~~~
BubRoss
All 'researchers' or everyone?

~~~
moneil971
More details here:
[https://fastmri.med.nyu.edu/](https://fastmri.med.nyu.edu/)

~~~
BubRoss
This was a very simple question and instead of answering it directly you
linked a page with 2000 words of disclaimers and license agreements.

~~~
Tushon
From TFA: "The application must include the investigator’s institutional
affiliation and the proposed uses of the data. NYU fastMRI data may be used
for internal research or educational purposes only as described in the data
use agreement and may not be redistributed in any way without prior
permission."

Take 10s to read yourself instead of complaining about others not reading for
you. It's at the top of the page currently, under a heading, "Apply for
Access".

------
pflats
Huh, I was part of a study during my brain MRIs at NYU Langone. I wonder if my
brain is in there.

------
d-d
... are these people aware these images of their bodies are publicly
available?

------
caycep
Was this at NEURIPS?

------
MaupitiBlue
Get ready to learn to code, diagnostic radiologists.

