
Yale researchers reconstruct facial images locked in a viewer’s mind - turing
http://news.yale.edu/2014/03/25/yale-researchers-reconstruct-facial-images-locked-viewer-s-mind
======
tokenadult
Cool result, but where is a publication showing the faces that were actually
tested and "reconstructed"? Many, many submissions to HN (like this one) are
press releases, and press releases are well known for spinning preliminary
research findings beyond all recognition. This has been commented on in the
PhD comic "The Science News Cycle,"[1] which only exaggerates the process a
very little. More serious commentary in the edited group blog post "Related by
coincidence only? University and medical journal press releases versus journal
articles"[2] points to the same danger of taking press releases (and news
aggregator website articles based solely on press releases) too seriously. I
look forward to seeing how this finding develops as other researchers comment
on it, review it in peer-reviewed publications, and attempt to replicate it.

The most sure and certain finding of any preliminary study will be that more
research is needed. Disappointingly often, preliminary findings don't lead to
further useful discoveries in science, because the preliminary findings are
flawed. If the technique reported here can generalize at sufficiently low
expense, it could lead to a lot of insight into the workings of the
known-to-be-complicated neural networks the human brain uses for recognizing
faces.

A useful follow-up link for any discussion of a report on a research result
like the one kindly submitted here is the article "Warning Signs in
Experimental Design and Interpretation"[3] by Peter Norvig, director of
research at Google, on how to interpret scientific research. Check each news
story you read for how many of the important issues in interpreting research
are NOT discussed in the story.

[1]
[http://www.phdcomics.com/comics.php?f=1174](http://www.phdcomics.com/comics.php?f=1174)

[2]
[http://www.sciencebasedmedicine.org/index.php/related-by-coincidence-only-journal-press-releases-versus-journal-articles/](http://www.sciencebasedmedicine.org/index.php/related-by-coincidence-only-journal-press-releases-versus-journal-articles/)

[3]
[http://norvig.com/experiment-design.html](http://norvig.com/experiment-design.html)

~~~
anigbrowl
Paper accessible via
[http://www.sciencedirect.com/science/article/pii/S1053811914001633](http://www.sciencedirect.com/science/article/pii/S1053811914001633?_rdoc=1&_fmt=high&_origin=ihub&_docanchor=&md5=9ffa87934275edd7180b52f5e973f002)
if you have an academic login. Your points on experimental design are good,
but I feel you're grandstanding a bit here; the press release is posted
because it's so hard to get access to the text of scientific papers thanks to
the publishing industry, so citing essays about experimental design ends up
casting implicit aspersions on the quality of the authors' work, as if the
paper would be more publicly available but for some reticence on their part.

I would like to see drastic changes to the journal publishing model, but I
don't consider article inaccessibility to be correlated with flaws in the
findings. To bring up the latter out of frustration over the former is
poisoning the well of debate.

Incidentally, I found the paper in under 30 seconds by searching for the
journal name, the senior author, and a few technical terms, all of which were
in the press release and which gave me a correct first search result. It would
be nice if the press release also contained a DOI link and other identifying
information, but I can't really blame press departments for supplying
journalists with the information they want and omitting that which most of
them don't.

~~~
icegreentea
For those who don't have journal access, here's the part that you're probably
most interested in: [http://imgur.com/xzNwUTL](http://imgur.com/xzNwUTL)

The left column is the original image. Next to it is a non-neural
PCA/Eigenface reconstruction. Next to that are the reconstructions from a
variety of brain regions.
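
For anyone unfamiliar with the eigenface baseline in that second column,
here's a minimal sketch of a PCA face reconstruction in Python (the array
names and component count are illustrative, not taken from the paper):

    import numpy as np

    # train_faces: (n_faces, n_pixels) aligned, flattened grayscale faces
    def fit_eigenfaces(train_faces, n_components=50):
        mean_face = train_faces.mean(axis=0)
        # Rows of Vt are the "eigenfaces": principal components of the set
        _, _, Vt = np.linalg.svd(train_faces - mean_face,
                                 full_matrices=False)
        return mean_face, Vt[:n_components]

    def reconstruct(face, mean_face, eigenfaces):
        # Project onto the eigenface basis, then map back to pixel space
        coeffs = eigenfaces @ (face - mean_face)
        return mean_face + coeffs @ eigenfaces

As I read the figure, the neural reconstructions work analogously, except the
coefficients are decoded from brain activity rather than computed from the
pixels.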

~~~
sillysaurus3
Would someone please upload the PDF for those of us who don't have academic
logins? The current state of affairs precludes people from thinking
critically. Instead we're forced to take publications at face value.

~~~
kevinchen
Here's the PDF. Science publishing sucks.

[http://cl.ly/3F140E1t0p0j](http://cl.ly/3F140E1t0p0j)

~~~
sillysaurus3
You're wonderful! Thank you so much.

------
chch
Although it's hard to tell from the images presented with the article, the
face generation looks like it could be similar to the techniques used in
Nishimoto et al., 2011, which used a similar library of learned brain
responses, though for movie trailers:

[http://www.youtube.com/watch?v=nsjDnYxJ0bo](http://www.youtube.com/watch?v=nsjDnYxJ0bo)

Their particular process is described in the YouTube caption:

The left clip is a segment of a Hollywood movie trailer that the subject
viewed while in the magnet. The right clip shows the reconstruction of this
segment from brain activity measured using fMRI. The procedure is as follows:

[1] Record brain activity while the subject watches several hours of movie
trailers.

[2] Build dictionaries (i.e., regression models) that translate between the
shapes, edges and motion in the movies and measured brain activity. A separate
dictionary is constructed for each of several thousand points at which brain
activity was measured. (For experts: The real advance of this study was the
construction of a movie-to-brain activity encoding model that accurately
predicts brain activity evoked by arbitrary novel movies.)

[3] Record brain activity to a new set of movie trailers that will be used to
test the quality of the dictionaries and reconstructions.

[4] Build a random library of ~18,000,000 seconds (5000 hours) of video
downloaded at random from YouTube. (Note these videos have no overlap with the
movies that subjects saw in the magnet). Put each of these clips through the
dictionaries to generate predictions of brain activity. Select the 100 clips
whose predicted activity is most similar to the observed brain activity.
Average these clips together. This is the reconstruction.

With the actual paper here:

[http://www.cell.com/current-biology/retrieve/pii/S0960982211009377](http://www.cell.com/current-biology/retrieve/pii/S0960982211009377)
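
The "dictionaries" in step [2] are per-voxel regression models from video
features to measured activity. Here's a toy sketch of that idea in Python
(the features and dimensions are my own placeholders, not the authors'
actual motion-energy model):

    import numpy as np
    from sklearn.linear_model import Ridge

    rng = np.random.default_rng(0)
    train_features = rng.normal(size=(7200, 500))   # per-second video features
    train_activity = rng.normal(size=(7200, 2000))  # per-second voxel responses

    # One linear "dictionary" per voxel (Ridge fits all voxels at once)
    encoder = Ridge(alpha=1.0).fit(train_features, train_activity)

    # The fitted model predicts activity for arbitrary novel video; step [4]
    # runs this over features of the ~18M-clip YouTube library.
    library_features = rng.normal(size=(1000, 500))
    predicted_activity = encoder.predict(library_features)  # (1000, 2000)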

~~~
yeukhon
_Select the 100 clips whose predicted activity is most similar to the observed
brain activity. Average these clips together. This is the reconstruction._

Mind explaining to me why the right clip is inconsistent with the left clip?

[https://www.youtube.com/watch?v=nsjDnYxJ0bo](https://www.youtube.com/watch?v=nsjDnYxJ0bo)

Starting at the 20-second mark I feel like the right clip is out of touch
with the left clip. Between the 20th and 22nd seconds I see at least three
individuals rendered from the reconstruction.

From the 26th second to the end of the clip I also see multiple individuals.
The names also look different from one another... When you say it finds the
closest, is that an expected result?

~~~
chch
I'll give the disclaimer that this paper isn't in my field, and I'm merely an
observer. However, I'll do my best to explain, since it's a little unclear.

From my reading, there were three sets of videos:

1) The several hours of "training" video, used to learn how the test
subject's brain responded to different stimuli. (The paper, which I've only
skimmed, says 7,200 seconds, which is two hours.)

2) 18,000,000 individual seconds of YouTube video that the test subject has
never seen.

3) The test video, aka the video on the left.

So, the first step was to have the subject watch several hours of video (1),
and watch how their brain responded.

Then, using this data, they built a model to predict how the brain would
respond to each of eighteen million separate one-second clips sampled
randomly from YouTube (2). The subject never watched these; the responses
were only predictions.

As an interesting test of this model, they decided to show the test subject a
new set of videos that was not contained in (1) or (2), the video you see in
the link above, (3). They read the brain information from this viewing, then
compared _each one-second clip_ of brain data to the predicted data in their
database from (2).

So, they took the first one second of the brain data, derived from looking at
Steve Martin from (3), then sorted the entire database from (2) by how similar
the (predicted) brain patterns were to that generated by looking at Steve
Martin.

They then took the top 100 of these 18M one-second clips and mixed them
together right on top of each other to make the general shape of what the
person was seeing. Because this exact image of Steve Martin was nowhere in
their database, this is their way to make an approximation of the image (as
another example, maybe (2) didn't have any elephant footage, but mix 100
videos of vaguely elephant shaped things together and you can get close). They
then did this for every second-long clip. This is why the figure jumps around
a bit and transforms into different people from seconds 20 to 22. For each of
these individual seconds, it is searching eighteen million second-long video
clips, mixing together the top 100 most similar, then showing you that
second-long composite.

Since each of these seconds is reconstructed _independently_, just from the
test subject's brain data, the video is not exact, and the figures created
don't necessarily 100% resemble each other.
However, the figures are in the correct area of the screen, and definitely
seem to have a human quality to them, which means that their technique for
classifying the videos in (2) is much better than random, since they are able
to generate approximations of novel video by only analyzing brain signal.
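
In other words, something like this rough Python sketch of my reading (the
array names are made up, and the real similarity measure may differ):

    import numpy as np

    def reconstruct_second(observed, predicted_library, clips, k=100):
        # observed: (n_voxels,) measured activity for one second of (3)
        # predicted_library: (n_clips, n_voxels) model predictions for (2)
        # clips: (n_clips, ...) the corresponding one-second video frames
        # Rank library clips by correlation with the observed activity
        obs = (observed - observed.mean()) / observed.std()
        lib = predicted_library - predicted_library.mean(axis=1, keepdims=True)
        lib = lib / lib.std(axis=1, keepdims=True)
        sims = lib @ obs / observed.size
        top = np.argsort(sims)[-k:]     # the 100 best-matching clips
        return clips[top].mean(axis=0)  # their average is the reconstruction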

Sorry, that was longer than I expected. :)

Edit: Also, if you see the paper, Figure 4 has a picture of how they
reconstructed some of the frames (including the one from 20-22 seconds) by
showing the screenshots from which the composite was generated.

~~~
chch
Alternatively, instead of reading all those words I just said, you can watch
[1], which is a video explanation of Figure 4 from the paper. :)

[https://www.youtube.com/watch?v=KMA23JJ1M1o](https://www.youtube.com/watch?v=KMA23JJ1M1o)

------
spikels
Too bad the reconstructed faces don't look anything like the presented faces,
and I'm sure these two examples are some of the best results.

I suspect the algorithm always outputs _some_ face generated by a
parameterized face model (neural net based?). Therefore even random output
would generate a face. Then with some "tuning" and a little wishful thinking
you might convince yourself this works.

Am I being too skeptical?

~~~
JackFr
I would say not skeptical enough.

Generating a result which looks like some combination of the inputs has a
little bit of 'Wow' factor if you just look at the pictures and don't think
too hard about it. But ultimately it's not an exercise in which they can be
wrong, and since the image returned is always going to be a face, they'll
always kind of be right.

An impressive result would be if they trained their system on the 300
pictures and then were reliably able to identify which picture the subject
was looking at. That would be quantifiable and testable, and I presume the
result would expose that this is all nonsense.
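
Scoring that kind of identification test would be straightforward (a
hypothetical sketch; "decoded" stands in for whatever the pipeline produces
from brain activity):

    import numpy as np

    def identification_accuracy(decoded, gallery, true_idx):
        # decoded:  (n_trials, n_pixels) reconstructions from brain data
        # gallery:  (300, n_pixels) the candidate pictures
        # true_idx: (n_trials,) index of the picture actually shown
        guesses = [np.argmin(((gallery - d) ** 2).sum(axis=1))
                   for d in decoded]
        return np.mean(np.array(guesses) == true_idx)  # chance = 1/300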

~~~
001sky
_" Since the image returned is always going to be a face, they'll always kind
of be right"_

So, non-falsifiable...?

------
aasarava
Let's stop with the "mind reading" warnings before they get too far out of
hand and consider what's really happening: Six subjects were shown a
"training" corpus of images first. Then shown new images. By comparing the
subjects' responses to the new images, the software in the study presumably
did its best to create composite images by pulling from the corpus.

So this raises many questions: How diverse were the faces in the training
corpus? How close were the new images to those in the corpus? When you're
looking at hundreds of images to train the machine, are you also unknowingly
being trained to think about images in a certain way? What happens when you
try to recreate faces based on the fMRI responses of subjects who didn't
contribute to the initial training set?

The implications of the last question are pretty interesting. If different
people have different brain responses to looking at the same image, does that
help us begin to understand why you and I can be attracted to different types
of people? Does it help begin to explain why two people can experience the
same event but walk away with two completely different interpretations?

~~~
JackFr
"If use terms like 'mind reading' in our press release we can get picked up by
a major news outlet, despite having no real notable scientific result."

Hypothesis confirmed!

------
freehunter
My first thought on this was its use in constructing an image of a wanted
criminal, as a way to replace police sketch artists. When I viewed their
images, they seemed incredibly close, but I don't think they're quite there.
I'm really looking forward to seeing this improve as they've stated it will.

I thought the woman looked close enough to identify, but the man did not.
Still, very impressive work.

~~~
scoot
_My first thought on this was its use in constructing an image of a wanted
criminal, as a way to replace police sketch artists._

They're reconstructing faces the subject is viewing, not remembering. To
replace police sketch artists, the criminal would have to be present...

~~~
freehunter
Ah, I guess I misread. Thanks for the correction.

------
WildUtah
Soon we will be able to finally do away with that hoary old libertarian canard
that the state cannot judge you and punish you for what you think inside your
own head. Just imagine how harmony and true equality will blossom then!

------
TrainedMonkey
You know, maybe people wearing foil hats were on to something after all.

In all seriousness, this is a great advance in neuroscience that would help
us understand many things about the brain. On the other hand, the potential
for misuse is enormous. Could you even prove you had been interrogated if
such a device were used on you?

~~~
bashinator
I think it's going to be quite a long time before fMRI machines get to the
point that you don't know one is being used on you. Room-temperature
superconductors are probably a requirement.

------
cma
This one is crazy:
[https://www.youtube.com/watch?v=SjbSEjOJL3U](https://www.youtube.com/watch?v=SjbSEjOJL3U)
(Mary Lou Jepsen on basically recreating video of what a person is watching
through fMRI)

------
bttf
We now have read access to a person's mind and its visual information when
dealing with faces. Naturally, write access cannot be far away ...

~~~
rquantz
The faces got into their brains in the first place because we already have
write access. Writing to the brain is easy -- it's reading what we've put
there that is difficult.

~~~
a-priori
It is possible to reconstruct an image by measuring the activation of V1
(primary visual cortex). The same technique works whether the image is
currently in the visual field or the person simply recalls it from memory.

------
electrichead
I found it interesting that the researcher thought that there was no
possibility of receiving external funding for something like this. I would
have thought the opposite. In fact, I can think of a bunch of companies that
would be only too happy to throw money at something like this.

------
spektom
I wonder whether this will work with a different person, not the one who
participated in the machine learning process: someone who has different
brain activity when the same faces from the training set are shown. Is it
possible that our brains analyze what we see differently?

------
notastartup
so... we can eventually create games and movies by imagining them, without
the need to do it by hand and input it into a computer

~~~
Houshalter
Possibly. One thing I'm worried about is that our brains might not be good
generative models. We remember the high level details, not a pixel by pixel
map of what we imagine. That's probably why humans aren't naturally good at
drawing. We can imagine a picture but trying to create an image on paper that
matches our imagination is quite difficult.

~~~
auxon0
True. However, if we are able to elevate this type of technology to that
level, people will train, and be trained, to focus on such detail and learn
to control the output via the feedback loop. At first you'd probably "draw"
like a baby, but with enough practice _some_ would become true artists, with
the average somewhere in the middle; things would be recognizable and, tada,
telepathy.

------
systematical
Great, now I can figure out who I slept with after blacking out.

