Open-sourcing DeepFocus, an AI-powered system for more realistic VR images (fb.com)
92 points by lainon 6 months ago | 11 comments




I really don't understand why they couldn't just use traditional blur techniques. They say:

> These fast but inaccurate methods of creating “game blur” ran counter to Half Dome’s mission, which is to faithfully reproduce the way light falls on the human retina.

... but traditional Z-based blur is no less faithful than their overall "whole screen at a single shifting plane of focus depending on gaze" approach. All of computer graphics is "more to do with cinematography than realism" anyway; realism is nice, but if you have to choose between "looks realistic" and "looks good", you go for "looks good" every time.

Also, as others have mentioned, getting sufficient resolution for really high-quality VR basically requires foveated rendering, at which point the bits you're blurring are, by definition, not what you're looking at (since you're rendering at lower resolution outside the fovea), and a blur algorithm that needs four GPUs to run in real time is a complete waste of resources.

Edit: Watched the video. Their 'Circle of Confusion' map is literally just 'focus_Z - pixel_Z'. I really don't see what deep learning adds here.
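
For illustration, here's a minimal sketch of the traditional Z-based approach being described: a per-pixel circle-of-confusion map computed directly from the depth buffer and the gaze-derived focal depth. The parameter names (aperture, max_coc) are made up for the example and are not from the paper; this is not DeepFocus's method.

    import numpy as np

    def coc_map(depth, focus_z, aperture=0.05, max_coc=8.0):
        # Signed distance from the gaze-derived focal plane.
        signed = depth - focus_z
        # Thin-lens-style scaling: farther from the focal plane => bigger CoC,
        # normalized by depth so distant offsets blur less.
        coc = aperture * np.abs(signed) / np.maximum(depth, 1e-6)
        return np.clip(coc, 0.0, max_coc)

    # A full DOF pass would then blur each pixel with a kernel whose
    # radius is proportional to its CoC value.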


It’s research, so it’s good to know how to do it really well so that in 10 years you can throw it in. Chasing photorealism is Sisyphean.

Using this now for anything like a commercial game is way past the point of diminishing returns. In their paper, the SNR gap between their solution and Unity’s default blur shader is a couple of percent, while the computational costs are extremely steep.

At 512x512 resolution, their fast solution adds 3 ms to the render time of each frame using, if I’m reading it right, 4 GPUs, which works out to a drop of roughly 10 fps. That’s mind-bogglingly slow for real-time graphics, especially VR, which is even more resolution- and frame-rate-sensitive.
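
Back-of-the-envelope on that 3 ms figure, assuming a 60 fps baseline (the 3 ms number alone doesn't fix the frame rate):

    frame_ms_60 = 1000.0 / 60.0        # ~16.7 ms budget at 60 fps
    with_blur   = frame_ms_60 + 3.0    # +3 ms reported for the fast model
    print(1000.0 / with_blur)          # ~50.8 fps, i.e. roughly a 10 fps drop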

If, instead of using Unity’s default DOF blur, you really focused on tuning a blur algorithm (which no one really bothers with because it’s not worth the effort), you could probably make up much of the difference.

But really, once you have eye tracking, status quo DOF blurring will be more than good enough.

The case they mention where someone would be in VR all day makes sense, but let’s all hope we never reach that dystopic vision.


It's more about estimating the edges, and the fact that the underlying depth estimate is always going to be noisy and approximate. For example, the Pixel 3 does something similar for Portrait mode [1].

Don't most modern real-time DOF techniques rely on some sort of G-buffer with primitive IDs? The article explicitly says they wanted to work with just RGBZ so that existing titles can simply be post-process blurred.

Finally, I think the time required is probably closer to 50 ms per frame per GPU. If you assume the content was running at about 60 fps (16 ms/frame) on a single GPU and was linearly sped up via magic to 240 fps (4-and-change ms), then the delta back to 60 fps is 12 ms across four GPUs, or about 48 ms on a single GPU.
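
The same estimate spelled out, under the assumptions stated above (60 fps baseline on one GPU, perfectly linear scaling to four GPUs):

    base_ms   = 1000.0 / 60.0           # ~16.7 ms/frame on one GPU, pre-blur
    scaled_ms = base_ms / 4.0           # ~4.2 ms if four GPUs scaled linearly
    target_ms = 1000.0 / 60.0           # blurred output still runs at ~60 fps
    blur_4gpu = target_ms - scaled_ms   # ~12.5 ms of blur time across 4 GPUs
    blur_1gpu = blur_4gpu * 4.0         # ~50 ms on a single GPU
    print(blur_4gpu, blur_1gpu)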

[1] https://ai.googleblog.com/2018/11/learning-to-predict-depth-...


This means Oculus has this technology working. Which means foveated VR headsets are coming soon.


This seems unrelated to foveated rendering unless I'm missing something. Foveated rendering reduces the complexity of rendering in areas outside what the fovea is looking at, because they are blurry to us anyway. This adds blur to out-of-focus objects, but it seems obvious that in order to appreciate the effect you'd have to be looking at them with your fovea.


The two are related. In order to infer the user's focal length, you need to know what they are looking at, which requires eye tracking, which is also a requirement of foveated rendering.


No it doesn't. You're ignoring hardware.


Hardware meaning the headsets that were mentioned in the GP post?


What's a "DeepFocus enabled headset"? Presumably eye/focus tracking?


Yes. My reading is that it would be a headset with eye tracking, such as the Half Dome prototype.

We also don’t know how much hardware assistance these headsets will have, so it could additionally refer to a headset with the ability to process or assist with this technique running on the hardware itself. This seems likely given the speed of eye movements and the need to reduce latency to generally imperceptible levels.



