What's most impressive is that they've separated the rendering algorithm (Mitsuba 2) from the retargeting framework (Enoki).
Enoki looks amazing from their paper. It supports vectorized CPU backends, JIT-compiled GPU kernels, forward/reverse-mode autodiff, and nested array types.
Mitsuba 2 then expands that range even further by templating on key types and operations. For example, a material's color property might be represented by an RGB tuple for basic rendering, or by an array that captures the full spectrum of light frequencies for a spectral renderer. They supply some example code, which is absurdly clean, as it's devoid of any specifics of storage and calculation and focuses just on the high-level algorithm.
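To make that concrete, here's a toy sketch (my own code, not Mitsuba's actual types or API) of what templating the shading math over the spectrum representation could look like, with an RGB and a 16-bin spectral instantiation of the same routine:

```cpp
#include <array>
#include <cstddef>
#include <iostream>

// Hypothetical sketch, not Mitsuba's real code: the same shading routine
// templated on the spectrum representation.

// Basic renderer: three components (R, G, B).
using RGB = std::array<float, 3>;

// Spectral renderer: e.g. 16 wavelength bins covering the visible range.
using Spectral = std::array<float, 16>;

// Element-wise product of two spectra of the same type.
template <typename Spectrum>
Spectrum multiply(const Spectrum &a, const Spectrum &b) {
    Spectrum out{};
    for (std::size_t i = 0; i < a.size(); ++i)
        out[i] = a[i] * b[i];
    return out;
}

// High-level "shading" step: attenuate incoming light by the material's
// reflectance. Nothing here cares whether Spectrum has 3 or 16 entries.
template <typename Spectrum>
Spectrum shade(const Spectrum &incoming, const Spectrum &reflectance) {
    return multiply(incoming, reflectance);
}

int main() {
    RGB light_rgb{1.0f, 0.9f, 0.8f}, albedo_rgb{0.5f, 0.2f, 0.1f};
    RGB out_rgb = shade(light_rgb, albedo_rgb);   // RGB instantiation

    Spectral light_s{}, albedo_s{};
    light_s.fill(1.0f);
    albedo_s.fill(0.3f);
    Spectral out_s = shade(light_s, albedo_s);    // spectral instantiation

    std::cout << out_rgb[0] << " " << out_s[0] << "\n";
    return 0;
}
```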
They claim that the GPU implementation is superior to PyTorch / TensorFlow in some regards, as it can split the difference between eagerly sending every operation to the GPU and processing the entire graph at once.
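If I understand the claim, the rough idea is the middle ground sketched below (a toy illustration of deferred batching, not Enoki's actual mechanism): operations are recorded rather than dispatched one at a time, then flushed as a single batch the moment a result is actually needed:

```cpp
#include <cstdio>
#include <functional>
#include <vector>

// Toy stand-in for deferred batching: record operations into a queue and
// flush them together, instead of launching one "kernel" per operation.
struct LazyQueue {
    std::vector<std::function<void()>> pending;

    // "Record" an operation instead of executing it immediately.
    void enqueue(std::function<void()> op) { pending.push_back(std::move(op)); }

    // Flush everything recorded so far in one go (stand-in for launching
    // one fused kernel instead of many tiny ones).
    void flush() {
        std::printf("flushing %zu queued ops as one batch\n", pending.size());
        for (auto &op : pending) op();
        pending.clear();
    }
};

int main() {
    std::vector<float> x{1, 2, 3, 4};
    LazyQueue q;

    // Each arithmetic step is recorded, not dispatched.
    q.enqueue([&] { for (auto &v : x) v *= 2.0f; });
    q.enqueue([&] { for (auto &v : x) v += 1.0f; });
    q.enqueue([&] { for (auto &v : x) v = v * v; });

    q.flush();                        // only now does work happen, in one batch
    std::printf("x[0] = %f\n", x[0]); // (1*2+1)^2 = 9
    return 0;
}
```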
The amount of work and understanding needed to produce something like this is insane - they just casually mention that they've implemented a novel light transport scheme, an "extensive mathematical support library", and sophisticated Python bindings.
My tool of choice is LuxRender (https://luxcorerender.org/). Can anyone explain how this is different?
I used it recently to make a chandelier's caustics for a real-time game. https://twitter.com/LeapJosh/status/1328520959968632833
Instead of taking a set of parameters (reflection coefficients, material colors, etc.) and generating an image, the inverse renderer takes an image and tries to derive the parameters.
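In its simplest gradient-based form, it boils down to something like this one-pixel toy (purely illustrative, not Mitsuba's actual machinery): a trivial forward model maps an albedo parameter to a pixel value, and gradient descent recovers the albedo that reproduces a target observation:

```cpp
#include <cstdio>

// Forward model: pixel = light * albedo (a stand-in for a real renderer).
float render(float albedo, float light) { return light * albedo; }

int main() {
    const float light = 2.0f;
    const float target_pixel = 1.2f;   // the "observed image" we want to match

    float albedo = 0.1f;               // initial guess for the scene parameter
    const float lr = 0.1f;             // gradient-descent step size

    for (int i = 0; i < 100; ++i) {
        float pixel = render(albedo, light);
        float error = pixel - target_pixel;   // L2 loss = error^2
        float grad  = 2.0f * error * light;   // d(loss)/d(albedo)
        albedo -= lr * grad;                  // descend
    }
    std::printf("recovered albedo = %f (true value 0.6)\n", albedo);
    return 0;
}
```

A real differentiable renderer does the same thing at scale: millions of pixels, thousands of parameters, and gradients obtained by autodiff through the light transport instead of by hand.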
As they mention in the video, this allows for a lot of interesting uses that you cannot really achieve with a forward renderer except by brute force.
The most striking thing to me is that it's not naturally lit. The light from the windows looks like someone put studio lights outside each one, rather than the sun and sky. Given that Mitsuba supports spectral rendering, an empirical spectral sky/sun model would improve the quality of the lighting by several orders of magnitude, and assuming such a model is present in Mitsuba, it would be an easy change.
Secondly, the glossy surfaces lack surface detail and natural variation; they are unnaturally even and free of imperfections.
There are other things as well, but those are the main ones to fix IMHO. Fixing the lighting should be simple, as mentioned, but adding the surface detail can be quite time-consuming.
In this case the image is meant as an illustration, so the simplifications made are quite acceptable.
When you use a physically based, photorealistic renderer and model an indoor or studio scene as realistically as possible, you often find that it looks flat, dull, or just bad.
And to fix it, you have to employ the same tricks that professional photographers and cinematographers do: adding extra light sources to reduce harsh shadows, adding highlights, and so on.
So by making the renderer more realistic, you end up having to fake things the way a photographer does.
The scene is demonstrating their "lightpath vectorization". If the claims work out, the real gain is better use of the full hardware capabilities by vectorizing multiple rays without GPU/SIMD branch divergence - the divergence happens when different rays intersect different objects, which dramatically slows down parallel work. That should really speed up rendering, allowing more rays and producing less noise and more detail.
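To illustrate the general idea (my own sketch, not the paper's actual scheme): if you regroup rays by the material they hit before shading, each batch runs one uniform code path instead of a per-ray branch, which is what SIMD lanes and GPU warps want:

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdio>
#include <vector>

// A ray after intersection: it carries the id of the material it hit.
struct Ray {
    int material_id;
    float payload;
};

int main() {
    // Rays as they come out of intersection: materials interleaved, so a
    // naive per-ray shading switch would diverge within a SIMD/GPU batch.
    std::vector<Ray> rays = {
        {2, 0.1f}, {0, 0.2f}, {1, 0.3f}, {0, 0.4f}, {2, 0.5f}, {1, 0.6f},
    };

    // Regroup so that rays hitting the same material are contiguous.
    std::sort(rays.begin(), rays.end(),
              [](const Ray &a, const Ray &b) { return a.material_id < b.material_id; });

    // Each contiguous run can now be shaded with one branch-free,
    // vectorizable loop per material instead of a per-ray switch.
    for (std::size_t i = 0; i < rays.size();) {
        int mat = rays[i].material_id;
        std::size_t j = i;
        while (j < rays.size() && rays[j].material_id == mat) ++j;
        std::printf("shading %zu rays with material %d in one uniform batch\n",
                    j - i, mat);
        i = j;
    }
    return 0;
}
```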
It is a pretty common test scene, and one I've used myself. It originally comes from https://blendswap.com/blend/13552, and there are versions of it adapted into a test scene at https://benedikt-bitterli.me/resources/ and https://casual-effects.com/data/.
What's bad about it? The only thing I can see to criticize is the fact that the 'rays' cast shadows onto the floor.