Deep image reconstruction from human brain activity (2017) (biorxiv.org)
288 points by hardmaru 37 days ago | hide | past | web | favorite | 85 comments

Thanks for this!

Acknowledging that I know nothing about psychology or neuroscience...

It's interesting that they were not able to extract good images from the test when they just asked the subjects to imagine the pictures. I speculate that is because the vast majority of people are able to recognize images of things, but not produce those images.

For a simple example, see this article about people trying to draw a bicycle. Before clicking, try to draw a 2D representation of a bicycle -- the wheels, the frame, pedals, handlebars, etc. You'd be surprised how difficult it is. https://road.cc/content/blog/90885-science-cycology-can-you-...

I would be very interested to see them repeat the experiment on e.g. artists who specialize in photorealism, or asking an expert on a very particular thing to imagine that particular thing (e.g. ask a violinist to imagine a violin). I wonder if that would yield more recognizable results than asking a random grad student to visualize a lion's face.

I think what's maybe misleading here is that while there may be an accurate representation of a bicycle somewhere in someone's brain, that does not mean they're able to consciously access it and, e.g., recreate it through a task of complex motor coordination etc. (in order to produce a drawing, I mean).

However, if you are using an fMRI, as in the paper, all you need is the accurate representation somewhere (and you need to know where that somewhere is of course).

Another example: let's say you look at a picture of cheetah. Now you try to imagine it; what appears to you is a vague sort of resemblance but nowhere near as clear as when you were looking directly at it. But, just because the prompting for you to imagine the image didn't cause an accurate recreation in consciousness, doesn't mean it didn't occur elsewhere, perhaps at some earlier/lower-level stage of visual processing.

I don't think the "draw a bicycle" task involves complex motor coordination. It's drawing 2 circles and ~10 straight lines. Doesn't need to be photorealistic, just in the correct configuration.

For example, I cannot visualize a 13-sided polygon accurately, but I can draw one pretty easily without having any drawing skills.

True. I hesitated to use that as an example because of the potential for confusion—but to clarify, I just meant that the motor skills are an example of another thing being involved beyond a clear representation somewhere, not that they are actually the limiting factor for reproducing the drawing.

A more realistic problem in the case of the bicycle drawing has to do with the procedural knowledge they use for drawing. For instance, maybe they approach it analytically, top-down, breaking it into major components to draw first, and those into sub components. Well, in order to do that, they need explicit knowledge of that hierarchy; they may have an accurate representation of the image somewhere, but they can't bring it directly into consciousness, and they didn't break it into that hierarchy while they had it around. So now what's left to them is to attempt to produce the hierarchy from an imperfectly recalled image, which leads to the strange configurations.

Just a guess—but I wouldn't be surprised to learn it's something along those lines.

There is also evidence that this ability to conjure imagery in our minds simply does not exist for some: https://en.wikipedia.org/wiki/Aphantasia

which leads me to wonder if the capability exists on a spectrum and is stronger or weaker in some. A stronger capability might correlate to a stronger 'signal' ...?

I am highly skeptical of Aphantasia, as it's impossible to evaluate what someone else believes they see. I think it's much more likely that Aphantasia is people (like me) who make sharp distinctions between what they can imagine (rotating colors on shapes, etc.) and what they see when looking at something for real - e.g., the detail of a loved one's face.

I challenge anyone who believes they have a highly developed mind's eye to draw such a face purely from memory.

I have this! I found out from a random daytime TV program and did some online tests. I didn’t realise until that point that other people can make clear images of things in their head.

Weirdly I’m actually quite good at drawing something from memory, like a bicycle (I love Pictionary!), but I can’t actually make an image of it in my head.

Brains are strange.

Agreed, from the description it sounds closer to "people who are aware of the missing parts in the imagined image".

Never heard of this "condition" before. It may apply to me. I can somewhat imagine a static thing or a face but trying to "zoom in"/"rotate" or imagining smaller details completely ruins the illusion.

Do people actually see images of things in their head?

I can close my eyes and think of something, but I won't perceive anything more than a linear description of it.

At best, I can use these visual descriptions to draw an outline in my head, but it instantly fades out if I don't continuously re-draw it.

Kind of like this: https://i.imgur.com/0zuPIPV.gifv

Can't reply to your other comment, presumably due to comment depth -- but yes, I did all of those vividly and with ease, even with my eyes open, except for counting the number of icons in my imagined phone screen. Somehow I can imagine the phone screen as a whole, filled with icons, but actually counting them feels weird, like how reading written text inside a dream is weird.

Until I read [0], I had never thought about it and just assumed that everyone else was like me. Hah!

[0] https://www.lesswrong.com/posts/baTWMegR42PAsH9qJ/generalizi...

Supposedly, most people do. There was an article on HN about "aphantasia" (the lack of mental imagery) not too long ago, but I can't seem to find it now. Google turns up quite a few hits though.

I can't help but be a little sceptical of the concept though. I believe there's probably a spectrum, but maybe not as wide as claimed. Based on the comment sections of basically any article on the subject, I wonder if there isn't something like impostor syndrome going on, where the majority of people don't think they have as sharp mental imagery as the average person.

Aphantasia is definitely real. My inner visual world has always consisted of the view of the inside of my eyelids. Thinking for me is an almost purely verbal process, in an inner voice that I don’t perceive as sound in any way. I can navigate in my mind, and describe accurately places I’ve been, but there’s never an image, only a kind of spatial perception, combined with knowledge (of materials, colors etc.)

I thought this was how everyone’s mind worked, until I met my ex, who is on the opposite end of the scale. She described her inner world as sometimes more vivid and detailed than the real world. I thought she was almost unique in this, until I started asking other people and realised that most people see images in their mind, of varying detail and vividness. Some can easily project onto their visual input and, for example, put a moustache or wings on someone. I’ve also come to learn that many people hear their own voice when they think (and many, non-schizophrenic, people hear other voices too), and can also imagine smells and tastes.

My episodic memory is affected by my aphantasia, as well as my people recognition skills. I can find it hard to recognise someone if they’ve changed their hair, grown a beard or changed their appearance in some other way.

I do dream though, vivid dreams, and am quite good at recalling them. But again the memory of the dreams, like other memories, is not visual.

I’m quite fascinated with this subject, and have some thoughts of picking up where I left off on my psychology master's to be able to study it more. It’d be interesting to see if there are correlates in big-5 personality, IQ, mental health, etc.

As far as I know, aphantasia is very real. But you can train yourself to see images in your head.

See: image streaming by Win Wenger http://winwenger.com/imstream.htm

I'd be very curious to see if you are able to see simple things in your mind's eye after practising this for a week or two

I have no issue conjuring clear images when closing my eyes. I guess it depends on how your brain is wired, and this is probably why we have words for people who are good at drawing, painting, etc. They probably have a mix of clear visualization and very good hand/finger coordination.

- Can you imagine the color of the floor, walls and ceiling at work?

- Can you imagine the color of the eyes and hairs of your coworkers?

- Can you imagine the home screen of your phone and see how many application icons can fit in a row and column?

- Can you imagine Google's logo and see the color of each letter?

- Can you imagine your car manufacturer's logo and draw it?

1) I can do that easily, and even can imagine the paint texture for the walls.

2) Hair for all of them, but I can only picture the eyes of the people who had unusual colors (blue, gray, green). Maybe I never paid much attention if they had normal brown eyes.

3) Row yes, but I wasn't sure about the column (4 or 5).

4) Yes.

5) easiest task on the list.

I'm pretty sure less than 1% of people know the colors of Google's letters.

Google has colored letters?

What you described is exactly how I experience things in my head as well, including that redrawing part.

I've noticed though that as I've spent years learning Japanese characters, I am somewhat better at visualizing them in my head. I can see (or "feel", as it's not really visual exactly in the same sense as seeing, although they are not just concepts but have spatial locations) whole subcomponents at a time instead of having to plot each line.

Yep, I can visualize images of some things, especially things I am familiar with or that left an impression on me (incomplete and with inaccuracies, I'm sure). I think I am also fairly visual/spatial in my thought processes.

Interesting. What are your dreams like? Do they look like abstract pictures (lanes? shapes?) or have no picture at all?

Dreams are more visual, although I can't recall ever seeing colors. There's no texture or details, just general outlines.

>Do people actually see images of things in their head?

Yes, and Forbidden Planet "plastic educator" comes to mind :)

Basing this on my rough memory of abnormal and developmental psych, the article and example you’re presenting are a type of Rey–Osterrieth Complex Figure task [https://en.wikipedia.org/wiki/Rey%E2%80%93Osterrieth_complex...].

Some people may be able to produce a bicycle in more detail than others, but that depends on how developed and healthy their brain is and how well they can use their executive functions. Without knowing all of the subjects' medical histories—early dementia, strokes, all the good stuff—and current health, it'd come as no surprise that even consciously we cannot produce true representations of what we saw.

Police sketches from eye witnesses may be a good thing to look at when thinking of producing images from memory. Some people have better brain health and are able to utilize many different high and low-level functions to produce more accurate descriptions. Others not so much.

How does one improve brain health?

Harvard updated their Buzzfeedesque post on this topic [https://www.health.harvard.edu/mind-and-mood/12-ways-to-keep...] which spans from crossword puzzles to avoiding alcohol.

But to keep it simple: read, do puzzles, stay active, do not drink or do drugs heavily.

Everybody already has a genetic makeup that limits their health—especially the brain—to a preset level. Some people suffer from infantile amnesia—basically not remembering early developmental stages (2-4) and preadolescence (~10 y/o); others claim to remember their birth (which is silly, since the ability to form memories develops at around age 2). Some have the luxury of a strong working memory^1—a fancy word for temporary memories—but suffer from lackluster short-term memory.

I’m sure you get the point I was trying to make with that mediocre explanation.

tldr: You cannot change your brain to not get dementia if your genetic makeup says so. You can, however, possibly delay the onset by keeping your brain active on a daily basis.

[1] https://en.wikipedia.org/wiki/Working_memory

I don't really think that you can only form memories from when you are 2 years old. I have memories of when they removed the cast from my broken arm, and I was probably 1 year and some months old.

In certain situations, especially ones related to trauma or an altering event, it may be possible to remember it faintly. But there are many factors that may have helped you remember this.

Have you spoken about it at great length when older? Is it often brought up how they took the cast off? If so, you’re looking at the idea of explicit (declarative) memory—the storing of facts and events. If over the years you are constantly reminded of this event, it is more than likely that you’re trying to generate a (possibly semi-accurate) image or memory of that particular event.

As far as your second comment goes, your mother tongue and other languages you learn are different things altogether. When you learn at a young age, your cerebral cortex is still developing and your mind is absorbing everything like a sponge. As you learn these words and continue to use them, that language and those procedures are going to get stored in your long-term memory, and then you’ll be able to recite all of these words and form sentences without thinking twice—something we would call implicit memory, which helps us with procedural memories (riding a bike, writing, etc.).

From what little I know of memory recall, there is a very good chance that is a false memory if it happened before people generally start being able to form memories.

So even all the words of the mother tongue that you learn when you are a baby are false memories? I'm really skeptical, and I am pretty sure that mine is not a false memory.

Feynman wrote about this in one of his books. It's really hard to understand how someone else understands something. But you can say, well, imagine it's orange and spherical and kind of hairy, and they'll have a picture in their mind you can kinda work with.

From "What Do You Care What Other People Think":

When I was a kid growing up in Far Rockaway, I had a friend named Bernie Walker. We both had “labs” at home, and we would do various “experiments.” One time, we were discussing something — we must have been eleven or twelve at the time — and I said, “But thinking is nothing but talking to yourself inside.”

“Oh yeah?” Bernie said. “Do you know the crazy shape of the crankshaft in a car?”

“Yeah, what of it?”

“Good. Now, tell me: how did you describe it when you were talking to yourself?”

So I learned from Bernie that thoughts can be visual as well as verbal.

Are you sure you're not thinking of the "hairy green ball thing" from his discussion of visualizing topology examples?


> It's interesting that they were not able to extract good images from the test when they just asked the subjects to imagine the pictures. I speculate that is because the vast majority of people are able to recognize images of things, but not produce those images.

It seems reasonable to me. To use a computer as an analogy: the signals produced by a webcam recording video to a hard drive are not going to be the same as the signals produced when that video plays from the hard drive onto the screen.

Or another way: the data in a frame buffer is not the same as the data in a png file.
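
To make the analogy concrete, here's a toy Python sketch (using zlib's DEFLATE as a stand-in for a real PNG encoder, since PNG is essentially filtered pixels plus DEFLATE): the raw pixel buffer and its encoded form carry the same image, but the byte sequences look nothing alike.

```python
import zlib

# A tiny 4x4 "frame buffer": raw RGB pixels, one byte per channel.
framebuffer = bytes([255, 0, 0] * 16)  # 16 red pixels

# An encoded file stores the same pixels in compressed form;
# zlib stands in here for the real PNG encoder.
encoded = zlib.compress(framebuffer)

print(framebuffer[:6])  # raw pixel data
print(encoded[:6])      # compressed stream: completely different bytes
print(framebuffer == zlib.decompress(encoded))  # True: content survives
```

Same picture, two very different representations, just as the signals in early visual cortex during perception need not match those during imagery.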

For those who don't see it, the full paper is linked on the right side (PDF):


The code and data for these amazing results have been generously made available here:


The code is well-commented Python and a pleasure to read.

Very impressive! I was quite skeptical until I read your comment. Now I'm just a little skeptical.

This looks awesome, but the methodology and experimental setup look pretty complicated, and on that basis alone it seems like there'd be some risk of bias or overfitting creeping into the analysis... I'm not criticizing the researchers or the experimental design; it's just a danger in any complicated study like this.

Bottom line, I'm definitely interested in this paper, but want to wait and see if others can replicate the results before getting overly excited.

This is impressive, and certainly technology like this will probably find lots of very good uses, but I can't help imagining(!) dystopian societies where "thoughtcrime" can actually be monitored and "made actionable".

A world in which "you have nothing to hide" turns into "you have nothing you can hide" is immensely disturbing.

Google already knows what you will be thinking about in the next minutes; no fMRI needed.

I bet they will soon replace the search bar by just a button.

When most people are connected to the cloud we will no longer need public security cameras, there will be someone on site automatically uploading what they saw.

Question: does this setup allow to decode images from the brains of different people than the one(s) it was trained for? How well does that work compared to using the same person?

Also interesting to know how performance differs between e.g. twins, people in the same family, people of different cultures, occupations, etc.

Reminds me of this project from 2011: https://www.youtube.com/watch?v=nsjDnYxJ0bo

If I were a billionaire, one of the projects I'd do would be to apply this sort of technology at scale, upon volunteers who were witness to poorly recorded historic events. WWII, Beatles concerts, JFK assassination, 9/11... all of them have been recorded from angles/perspectives that exist only in peoples' memories, and the longer we wait, the more of those memories will be lost.

You’re vastly overestimating the human brain’s ability to accurately recall first hand accounts. It’s actually kind of shocking how bad eyewitnesses are at remembering details.

Isn't there a Russian saying: "He lies like an eyewitness"?

When we build neural networks to decode our own neural networks...

We already build neural networks to predict how our own neural networks react to input (marketing, i.e. Facebook).

The obvious next step is to combine that with the findings in the article.

I.e.: create a VR (or AR) headset that can both present an audio-visual experience to the user WHILE having enough sensors to accurately measure the user's brain state (and other bodily states, such as heart rate, blood pressure, etc.).

That will place a game developer in a position to predict what events a game can create that would generate the greatest possible engagement in a given gamer. They can time rewards, challenges, and aesthetic or sexual stimuli in the way that gives the greatest possible control of the player's future behaviour.

In the hands of the greatest data scientists, this will allow them to create a hitherto unprecedented level of addiction, which can then be monetized through subscriptions, micro-transactions, embedded advertisements, or just by selling the data.

In the hands of governments, this can be used to pacify any population, drive behaviour in desired directions, or create support (even demand) for given policies.

In the hands of parents, it can greatly help them teach their children whatever they want to teach them. (Be it morality, religion, self-discipline, or whatever)

In the hands of a great psychologist, it would be able to cure most mental disorders less severe than outright psychosis.

How will we face this? Will anyone outside the elite be able to prevent this being used against them?

Neuroscientists have been using "AI" basically forever. Except to them it's not "AI"; it's just mathematical tools for data analysis.

Never mentioned AI. Simply noting that the mathematical tool used here is inspired by the brain.

Do you remember that project that mapped where words are "stored" in the brain? I don't have it right now, but it was something like that. They used a neural approach to map the word vectors in the brain.

That sounds like the brain atlas created with subjects listening to stories in an fMRI machine.

Here's a Guardian article about it: https://www.theguardian.com/science/2016/apr/27/brain-atlas-...

And a link to the paper: https://www.nature.com/articles/nature17637

Deep^2 learning

From the paper:

> We used the Caffe implementation (Jia et al., 2014) of the VGG19 deep neural network (DNN) model (Simonyan & Zisserman, 2015; available from https://github.com/BVLC/caffe/wiki/Model-Zoo). All visual images were resized to 224 × 224 pixels to compute outputs by the VGG19 model. The VGG19 model consisted of a total of sixteen convolutional layers and three fully connected layers.

They used a VGG-19 network for the feature detection, so in my basic understanding the results of this could already be improved by switching to a ResNet.
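
The 224 × 224 resize step from the quoted passage is simple to illustrate. Below is a dependency-free nearest-neighbour version; note that real pipelines use proper resampling (bilinear/bicubic via PIL or OpenCV), so this is only a sketch of the idea.

```python
def resize_nearest(pixels, width, height, new_w=224, new_h=224):
    """Nearest-neighbour resize of a row-major list of pixel values."""
    out = []
    for y in range(new_h):
        src_y = y * height // new_h  # map output row to source row
        for x in range(new_w):
            src_x = x * width // new_w  # map output column to source column
            out.append(pixels[src_y * width + src_x])
    return out

# A dummy 300x500 "image" of per-pixel intensities.
img = list(range(300 * 500))
resized = resize_nearest(img, width=300, height=500)
print(len(resized))  # 224 * 224 = 50176
```

Whatever the input size, the network always sees a fixed 224 × 224 grid, which is what lets the fully connected layers have a fixed input dimension.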

Not necessarily. While ResNets are optimized for achieving the best image classification accuracy, there is no guarantee that the features extracted from them will transfer well to other tasks such as this one, especially when the task is to generate images. In fact, it has been shown that pre-trained VGG networks are a lot more useful than pre-trained ResNets for tasks other than image classification, such as style transfer.

Very interesting. For my curiosity I did some searching on VGG vs ResNet.

* ResNets were better at feature extraction for image clustering [0]

* One of the trade-offs of ResNets seems to be their relative complexity to VGG [1]

* It surprises me that ResNets aren't significantly faster to train based on the large reduction of FLOPs (from the ResNet paper VGG-19 had 19.6 billion FLOPs vs ResNet-34 with 3.6 billion FLOPs) - I think people just train deeper ResNets e.g. ResNet-50
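
The per-layer arithmetic behind those FLOP counts is straightforward. Here's a rough sketch of the standard convolution-cost formula (multiply-adds counted as two FLOPs; biases, pooling, and FC layers ignored, so treat the numbers as approximate):

```python
def conv_flops(h, w, c_in, c_out, k=3):
    """Approximate FLOPs for one conv layer producing an h x w output:
    each output element needs k*k*c_in multiply-adds (2 FLOPs each)."""
    return 2 * h * w * c_in * c_out * k * k

# First two VGG-19 conv layers on a 224x224 RGB input, as an example:
layer1 = conv_flops(224, 224, 3, 64)   # ~0.17 GFLOPs
layer2 = conv_flops(224, 224, 64, 64)  # ~3.7 GFLOPs
print(layer1 / 1e9, layer2 / 1e9)
```

Summing this over VGG-19's sixteen conv layers gets you into the ~20 GFLOP range cited above, versus ResNet-34's ~3.6 GFLOPs (roughly a 5.4× gap) - though wall-clock training time also depends on memory traffic and how well each architecture maps to the hardware, not just FLOPs.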

[0]: https://medium.com/@franky07724_57962/using-keras-pre-traine...

[1]: https://www.reddit.com/r/MachineLearning/comments/6e6mlf/d_i...

You might enjoy this recent article on Distill:


They discuss VGG vs non-VGG architectures in the context of style transfer in Section 2, which was interesting to me.

Excellent, digging down into that leads to this Reddit post [0]

> One thing they noticed was that using features from a pretrained ImageNet VGG-16/19 CNN from 2014 (4 years ago), like the original Gatys paper did, worked much better than anything else; indeed, almost any set of 4-5 layers in VGG would provide great features for the style transfer optimization to target (as long as they were spread out and weren't exclusively bottom or top layers), while using more modern resnets (resnet-50) or GoogLeNet Inception v1 didn't work - it was hard to find sets of layers that would work at all and when they did, the quality of the style transfer was not as good. Interestingly, this appeared to be true of VGG CNNs trained on the MIT Places scene recognition database too, suggesting there's something architectural going on which is not database specific or peculiar to those two trained models. And their attempt at an upscaling CNN modeled on Johnson et al 2016's VGG-16 for CIFAR-100 worked well too.

[0]: https://www.reddit.com/r/MachineLearning/comments/7rrrk3/d_e...

I now realise that that article comes back to your very own twitter post [0], thanks for your patience!

[0]: https://twitter.com/hardmaru/status/954173051330904065

It would be interesting to do this experiment with people who had high abilities in drawing lifelike pieces.

If you actually look at the images, they are not very impressive, especially if you know about retinotopy in the visual cortex. This means there is a mapping between coordinates on the retina and coordinates in the visual cortex. I'm not convinced this is doing much other than picking up on this long-established fact to produce very vaguely similar images. Seriously, look at how different the input image and the generated image are!

I guess it makes sense that HN is skeptical about results more closely related to the community's expertise, but it's a pretty crazy leap from being able to reconstruct something vaguely similar to what a person is currently seeing to trying to read someone's mind. I still have seen very little evidence that such a thing could be possible at all with fMRI. The spatial and temporal resolution are just far too low. The more everyone is impressed with results like this, the more we delay the hard work of developing tools that actually have a shot at doing something like that.

Next: all you need to make the next billion-dollar movie is your imagination, literally (and a $100M helium-cooled MRI machine, and $25 worth of AWS credits).

The critical component is your imagination. If your imagination is at that level, you are likely to be successful in Hollywood, even today.

But I think the greatest utility of this tech is not in the movie industry, but in the gaming industry.

Imagine a photorealistic version of The Sims where your family and friends know EXACTLY how to react to maximize your sense of meaning, happiness, and enjoyment in precisely the way needed to prevent you from leaving the game.

MRI is crazy expensive, but you only really need $4M for the scanner, so doable for indie rom-coms as well as blockbusters!

MRI market is very small and regulated as of today. Wider adoption of reasonably simpler devices would change everything, just recall how crazy expensive computers were in 1950s.

Among useful applications I would imagine, for example, fast image-based information retrieval - e.g. useful for recommendation systems (in other words, Netflix could invest in an "imagine anything and we'll find you a movie like that" feature).

Maybe, but I think if it happens at all, it's going to use some other form of neuroimaging. MRI needs a lot of power, a shielded room, strong magnetic fields (which are both annoying and dangerous). Many of these are physical constraints rather than just...where the market happens to be.

Hmm. Do different humans produce relatively the same brain images for a given scene? If not, I’d imagine training this would be a real PITA.

fMRI data is very noisy, and that's reflected in the generated images. They're not at all recognizable, unless they're sampled from a very small class like simple geometric shapes. I doubt it's possible to do much better, with fMRI.

I was wondering the same thing, hopefully someone in the comments will know more about it.

Why a PITA? Just show a known movie to the person while scanning.

Blog post that has some info on the new GAN-based mind reading ideas from last year (including this one):


Does the 50GB of data include the MRI measurements from both during training and testing?

Wouldn’t that training set be valuable!

But how do I ice a data angel?

Wow, I'm calling a hoax. They produce images of dubious value, and for whatever reason, contaminate the results by mutating them with the Deep Dream stylization.

This is as bad as the dead salmon results.

EDIT: You can flag all you want, but look at this playback of the reconstruction:


There's nothing useful happening here.

The code and data is here:


Not a hoax.

Your comparison to Deep Dream is quite apt, for a good reason that you clearly don't realize:

Deep Dream is an attempt to "enhance patterns in images via algorithmic pareidolia"- Pareidolia being a "psychological phenomenon in which the mind responds to a stimulus". So yes, this project is affected by a real life "deep dream".

That's exactly what they say they're doing. That's not "contamination" - that's the whole point.

And that's exactly why this is garbage research. Point deep dream at an interference pattern of pure noise, after training it on a set of target images.

Don't be surprised when it "finds" the image, pulled from raw static that contained no signal.

Not convinced this is as junk as you claim it to be. From the abstract:

> While our model was solely trained with natural images, our method successfully generalized the reconstruction to artificial shapes, indicating that our model indeed reconstructs or generates images from brain activity, not simply matches to exemplars.

If I am reading that correctly it is reconstructing images that the subjects had not seen before.

Not my field of expertise, but on page 15 they claim "In our reconstruction analysis, we used a pre-trained DGN which was provided by Dosovitskiy & Brox". So if the deep dream generator is not trained on their actual (limited) training data but instead on a massive dataset of "all photos that are real", I would say the study is valid.

Don't feed trolls with throwaway accounts.
