
Deep image reconstruction from human brain activity (2017) - hardmaru
https://www.biorxiv.org/content/early/2017/12/28/240317
======
hardmaru
More information on this thread:
[https://twitter.com/ykamit/status/1039672289413844993](https://twitter.com/ykamit/status/1039672289413844993)

~~~
NetOpWibby
Thanks for this!

------
gamegoblin
Acknowledging that I know nothing about psychology or neuroscience...

It's interesting that they were not able to extract good images in the test
where they just asked the subjects to _imagine_ the pictures. I speculate that
this is because the vast majority of people are able to recognize images of
things, but not produce those images.

For a simple example, see this article about people trying to draw a bicycle.
Before clicking, try to draw a 2D representation of a bicycle -- the wheels,
the frame, pedals, handlebars, etc. You'd be surprised how difficult it is.
[https://road.cc/content/blog/90885-science-cycology-can-you-...](https://road.cc/content/blog/90885-science-cycology-can-you-draw-bicycle)

I would be _very_ interested to see them repeat the experiment on e.g. artists
who specialize in photorealism, or asking an expert on a very particular thing
to imagine that particular thing (e.g. ask a violinist to imagine a violin). I
wonder if that would yield more recognizable results than asking a random grad
student to visualize a lion's face.

~~~
lsh
there is also evidence that this ability to conjure imagery in our minds
simply does not exist for some people:
[https://en.wikipedia.org/wiki/Aphantasia](https://en.wikipedia.org/wiki/Aphantasia)

which leads me to wonder if the capability exists on a spectrum, stronger in
some people and weaker in others. A stronger capability might correlate with a
stronger 'signal' ...?

~~~
miguelrochefort
Do people actually _see_ images of things in their head?

I can close my eyes and think of something, but I won't perceive anything more
than a linear description of it.

At best, I can use these visual descriptions to draw an outline in my head,
but it instantly fades out if I don't continuously re-draw it.

Kind of like this:
[https://i.imgur.com/0zuPIPV.gifv](https://i.imgur.com/0zuPIPV.gifv)

~~~
ghkbrew
Supposedly, most people do. There was an article on HN about "aphantasia" (the
lack of mental imagery) not too long ago, but I can't seem to find it now.
Google turns up quite a few hits, though.

I can't help but be a little sceptical of the concept, though. I believe
there's probably a spectrum, but maybe not as wide as claimed. Based on the
comment sections of basically any article on the subject, I wonder if there
isn't something like impostor syndrome going on, where the _majority_ of
people don't think they have as sharp mental imagery as the average person.

~~~
Henk0
Aphantasia is definitely real. My inner visual world has always consisted of
the view of the inside of my eyelids. Thinking for me is an almost purely
verbal process, in an inner voice that I don’t perceive as sound in any way. I
can navigate in my mind, and describe accurately places I’ve been, but there’s
never an image, only a kind of spatial perception, combined with knowledge (of
materials, colors, etc.).

I thought this was how everyone’s mind worked, until I met my ex, who is on
the opposite end of the scale. She described her inner world as sometimes more
vivid and detailed than the real world. I thought she was almost unique in
this, until I started asking other people and realised that most people see
images in their mind, of varying detail and vividness. Some can easily project
onto their visual input and, for example, put a moustache or wings on someone.
I’ve also come to learn that many people hear their own voice when they think
(and many, non-schizophrenic, people hear other voices too), and can also
imagine smells and tastes.

My episodic memory is affected by my aphantasia, as well as my people
recognition skills. I can find it hard to recognise someone if they’ve changed
their hair, grown a beard or changed their appearance in some other way.

I do dream though, vivid dreams, and am quite good at recalling them. But
again the memory of the dreams, like other memories, is not visual.

I’m quite fascinated with this subject, and have some thoughts of picking up
where I left off on my psychology master’s to be able to study it more. It’d
be interesting to see if there are correlates in Big Five personality, IQ,
mental health, etc.

------
sdinsn
For those who don't see it, the full paper is linked on the right side (PDF):

[https://www.biorxiv.org/content/biorxiv/early/2017/12/28/240...](https://www.biorxiv.org/content/biorxiv/early/2017/12/28/240317.full.pdf)

------
randomdrake
The code and data for these amazing results have been generously made
available here:

[https://github.com/KamitaniLab/DeepImageReconstruction](https://github.com/KamitaniLab/DeepImageReconstruction)

The code is well-commented Python and a pleasure to read.

~~~
AlexCoventry
Very impressive! I was quite skeptical until I read your comment. Now I'm just
a little skeptical.

------
drcode
This looks awesome, but the methodology and experimental setup look pretty
complicated, and on that basis alone it seems like there'd be some risk of
bias or overfitting creeping into the analysis... I'm not criticizing the
researchers or the experimental design; it's just a danger in any complicated
study like this.

Bottom line, I'm definitely interested in this paper, but want to wait and see
if others can replicate the results before getting overly excited.

------
userbinator
This is impressive, and technology like this will probably find lots of very
good uses, but I can't help imagining(!) dystopian societies where
"thoughtcrime" can actually be monitored and "made actionable".

A world in which "you have nothing to hide" turns into "you have nothing you
_can_ hide" is immensely disturbing.

~~~
amelius
Google already knows what you will be thinking about in the next few minutes;
no fMRI needed.

I bet they will soon replace the search bar with just a button.

------
amelius
Question: does this setup allow decoding images from the brains of
_different_ people than the one(s) it was trained on? How well does that work
compared to using the same person?

~~~
amelius
It would also be interesting to know how performance differs between e.g.
twins, people in the same family, people of different cultures, occupations,
etc.

------
xamuel
Reminds me of this project from 2011:
[https://www.youtube.com/watch?v=nsjDnYxJ0bo](https://www.youtube.com/watch?v=nsjDnYxJ0bo)

If I were a billionaire, one of the projects I'd do would be to apply this
sort of technology at scale, on volunteers who witnessed poorly recorded
historic events. WWII, Beatles concerts, the JFK assassination, 9/11... all of
them have been recorded from angles/perspectives that exist only in people's
memories, and the longer we wait, the more of those memories will be lost.

~~~
JHonaker
You’re vastly overestimating the human brain’s ability to accurately recall
firsthand accounts. It’s actually kind of shocking how bad eyewitnesses are
at remembering details.

~~~
JoeAltmaier
Isn't there a Russian saying: "He lies like an eyewitness"?

------
lucidrains
When we build neural networks to decode our own neural networks...

~~~
a-dub
neuroscientists have been using "ai" basically forever. except to them it's
not "ai," it's just mathematical tools for data analysis.

~~~
lucidrains
Never mentioned AI. Simply noting that the mathematical tool used here is
inspired by the brain.

------
icc97
From the paper:

> We used the Caffe implementation (Jia et al., 2014) of the VGG19 deep neural
> network (DNN) model (Simonyan & Zisserman, 2015; available from
> [https://github.com/BVLC/caffe/wiki/Model-Zoo](https://github.com/BVLC/caffe/wiki/Model-Zoo)).
> All visual images were resized to 224 × 224 pixels to compute outputs by the
> VGG19 model. The VGG19 model consisted of a total of sixteen convolutional
> layers and three fully connected layers.

They used a VGG-19 network for the feature detection, so in my basic
understanding the results of this could already be improved by switching to a
ResNet.

~~~
hardmaru
Not necessarily. While ResNets are optimized for achieving the best image
classification accuracy, there is no guarantee that the features extracted
from them will transfer well to other tasks such as this one, especially when
the task is to generate images. In fact, it has been shown that pre-trained
VGG networks are a lot more useful than pre-trained ResNets for tasks other
than image classification, such as style transfer.

~~~
icc97
Very interesting. Out of curiosity, I did some searching on VGG vs. ResNet.

* ResNets were better at feature extraction for image clustering [0]

* One of the trade-offs of ResNets seems to be their relative complexity to VGG [1]

* It surprises me that ResNets aren't significantly faster to train, given the large reduction in FLOPs (from the ResNet paper, VGG-19 had 19.6 billion FLOPs vs. ResNet-34 with 3.6 billion) - I think people just train deeper ResNets, e.g. ResNet-50

[0]: [https://medium.com/@franky07724_57962/using-keras-pre-traine...](https://medium.com/@franky07724_57962/using-keras-pre-trained-models-for-feature-extraction-in-image-clustering-a142c6cdf5b1)

[1]:
[https://www.reddit.com/r/MachineLearning/comments/6e6mlf/d_i...](https://www.reddit.com/r/MachineLearning/comments/6e6mlf/d_is_vgg_common_in_newer_research_or_is_resnet/)
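
For what it's worth, that 19.6 billion figure can be reproduced by hand from the standard VGG-19 layer list (a rough multiply-add count over conv and fc layers only, ignoring ReLU/pooling/bias, which is the convention the ResNet paper uses):

```python
# Rough multiply-add count for VGG-19 on a 224x224 input.
# Conv and fully connected layers only; ReLU/pooling/bias ignored.

# VGG-19 conv config: output channels per 3x3 conv, 'M' = 2x2 max-pool
cfg = [64, 64, 'M', 128, 128, 'M', 256, 256, 256, 256, 'M',
       512, 512, 512, 512, 'M', 512, 512, 512, 512, 'M']

def vgg19_madds(size=224, in_ch=3):
    total = 0
    for v in cfg:
        if v == 'M':
            size //= 2  # max-pool halves the spatial resolution
        else:
            total += size * size * in_ch * v * 9  # 3x3 kernel
            in_ch = v
    # three fully connected layers: 7*7*512 -> 4096 -> 4096 -> 1000
    total += 7 * 7 * 512 * 4096 + 4096 * 4096 + 4096 * 1000
    return total

print(round(vgg19_madds() / 1e9, 1))  # 19.6
```

Almost all of the cost is in the convolutions at the higher spatial resolutions, which is exactly what ResNet's aggressive early downsampling avoids.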

~~~
hardmaru
You might enjoy this recent article on Distill:

[https://distill.pub/2018/differentiable-parameterizations/#s...](https://distill.pub/2018/differentiable-parameterizations/#section-styletransfer)

They discuss VGG vs non-VGG architectures in the context of style transfer in
Section 2, which was interesting to me.

~~~
icc97
Excellent, digging down into that leads to this Reddit post [0]

> One thing they noticed was that using features from a pretrained ImageNet
> VGG-16/19 CNN from 2014 (4 years ago), like the original Gatys paper did,
> worked much better than anything else; indeed, almost any set of 4-5 layers
> in VGG would provide great features for the style transfer optimization to
> target (as long as they were spread out and weren't exclusively bottom or
> top layers), while using more modern resnets (resnet-50) or GoogLeNet
> Inception v1 didn't work - it was hard to find sets of layers that would
> work at all and when they did, the quality of the style transfer was not as
> good. Interestingly, this appeared to be true of VGG CNNs trained on the MIT
> Places scene recognition database too, suggesting there's something
> architectural going on which is not database specific or peculiar to those
> two trained models. And their attempt at an upscaling CNN modeled on Johnson
> et al 2016's VGG-16 for CIFAR-100 worked well too.

[0]:
[https://www.reddit.com/r/MachineLearning/comments/7rrrk3/d_e...](https://www.reddit.com/r/MachineLearning/comments/7rrrk3/d_eat_your_vggtables_or_why_does_neural_style/)

------
Agnosco
It would be interesting to do this experiment with people who had high
abilities in drawing lifelike pieces.

------
dontreact
If you actually look at the images, they are not very impressive, especially
if you know about retinotopy in the visual cortex: there is a mapping between
coordinates on the retina and coordinates in the visual cortex. I'm not
convinced this is doing much other than picking up on this long-established
fact to produce very vaguely similar images. Seriously, look at how different
the input image and the generated image are!

I guess it makes sense that HN is skeptical about results closely related to
the community's expertise, but it's a pretty crazy leap from being able to
reconstruct something vaguely similar to what a person is currently seeing to
trying to read someone's mind. I have still seen very little evidence that
such a thing could be at all possible with fMRI. The spatial and temporal
resolution are just far too low. The more everyone is impressed with results
like this, the more we delay the hard work of developing tools that actually
have a shot at doing something like that.

------
Elv13
Next: All you need to make the next billion-dollar movie is your imagination,
literally (and a $100M helium-cooled MRI machine, and $25 worth of AWS
credits).

~~~
mattkrause
MRI is crazy expensive, but you only really need $4M for the scanner, so it's
doable for indie rom-coms as well as blockbusters!

~~~
snaky
The MRI market is very small and heavily regulated as of today. Wider adoption
of reasonably simpler devices would change everything; just recall how crazy
expensive computers were in the 1950s.

Among the useful applications I would imagine is, for example, fast
image-based information retrieval for recommendation systems - in other
words, Netflix could invest in an 'imagine anything and we'll find you a
movie like that' feature.

~~~
mattkrause
Maybe, but I think if it happens at all, it's going to use some other form of
neuroimaging. MRI needs a lot of power, a shielded room, and strong magnetic
fields (which are both annoying and dangerous). Many of these are physical
constraints rather than just... where the market happens to be.

------
lwansbrough
Hmm. Do different humans produce relatively the same brain images for a given
scene? If not, I’d imagine training this would be a real PITA.

~~~
AlexCoventry
fMRI data is very noisy, and that's reflected in the generated images. They're
not at all recognizable unless they're sampled from a very small class, like
simple geometric shapes. I doubt it's possible to do much better with fMRI.

------
DanielleMolloy
Blog post that has some info on the new GAN-based mind reading ideas from last
year (including this one):

[https://www.mindcodec.com/using-gans-brain-reading/](https://www.mindcodec.com/using-gans-brain-reading/)

------
DoctorOetker
Does the 50GB of data include the MRI measurements from both the training and
test sessions?

------
m3kw9
Wouldn’t that training set be valuable!

------
daodedickinson
But how do I ice a data angel?

------
dreepDeam
Wow, I'm calling a hoax. They produce images of dubious value, and for
whatever reason, contaminate the results by mutating them with the Deep Dream
stylization.

This is as bad as the dead salmon results.

EDIT: You can flag all you want, but look at this playback of the
reconstruction:

[https://video.twimg.com/tweet_video/DSrWxhSVQAE2Xix.mp4](https://video.twimg.com/tweet_video/DSrWxhSVQAE2Xix.mp4)

There's nothing useful happening here.

~~~
sdinsn
The code and data are here:

[https://github.com/KamitaniLab/DeepImageReconstruction](https://github.com/KamitaniLab/DeepImageReconstruction)

Not a hoax.

Your comparison to Deep Dream is quite apt, for a reason you clearly don't
realize:

Deep Dream is an attempt to "enhance patterns in images via algorithmic
pareidolia" - pareidolia being a "psychological phenomenon in which the mind
responds to a stimulus". So yes, this project is affected by a real-life
"deep dream".

