

Mandelbrot Set with SIMD Intrinsics - kilimchoi
http://nullprogram.com/blog/2015/07/10/

======
w0utert
5.20ms per image to render a Mandelbrot set at 1440x800 using AVX. My mind
boggles when I think back to the time (a little over 20 years ago) when I
typed in a Mandelbrot program from a magazine on my Commodore 128, ran it,
went to school, to return 8 hours later to just see it complete the last few
pixels of a monochrome Mandelbrot at a glorious 320x200 resolution ;-)

------
drxzcl
Just in case you are tired of old-fashioned fixed-function CPUs, check out
this awesome realtime full HD FPGA renderer!

[http://hamsterworks.co.nz/mediawiki/index.php/Mandelbrot_NG_...](http://hamsterworks.co.nz/mediawiki/index.php/Mandelbrot_NG_1080i)

------
panic
The standard visualization for the Mandelbrot set (which is used here) colors
pixels according to the number of iterations it takes for the corresponding
point to get far enough away from zero that it can no longer be part of the
set. It's worth noting that coloring pixels according to the estimated
distance to the fractal instead (using a technique like
[http://iquilezles.org/www/articles/distancefractals/distance...](http://iquilezles.org/www/articles/distancefractals/distancefractals.htm))
can produce a more detailed image: pixels very close to a thin edge of the
fractal are included in a distance-estimation visualization but can be missed
by an escape-time coloring.

~~~
anon4
A quick, very pedantic note: the images don't show colouring based on linear
distance. The problem is that your monitor works in a non-linear colourspace
(sRGB), while the image is generated in linear RGB but not converted to sRGB
afterwards. If you want to show proper colours, you need to convert the final
value from linear RGB to sRGB. For people writing shaders, here are two handy
functions I keep around:

    
    
        // Encode linear RGB as sRGB (piecewise sRGB transfer function).
        vec3 srgb_from_rgb(vec3 rgb) {
            vec3 a = vec3(0.055, 0.055, 0.055);
            vec3 ap1 = vec3(1.0, 1.0, 1.0) + a;
            vec3 g = vec3(2.4, 2.4, 2.4);
            vec3 ginv = 1.0 / g;
            // 0.0 below the threshold (linear segment), 1.0 otherwise (power segment)
            vec3 select = step(vec3(0.0031308, 0.0031308, 0.0031308), rgb);
            vec3 lo = rgb * 12.92;
            vec3 hi = ap1 * pow(rgb, ginv) - a;
            return mix(lo, hi, select);
        }
    
        // Decode sRGB back to linear RGB (inverse of the function above).
        vec3 rgb_from_srgb(vec3 srgb) {
            vec3 a = vec3(0.055, 0.055, 0.055);
            vec3 ap1 = vec3(1.0, 1.0, 1.0) + a;
            vec3 g = vec3(2.4, 2.4, 2.4);
            // per-channel choice between the linear and the power segment
            vec3 select = step(vec3(0.04045, 0.04045, 0.04045), srgb);
            vec3 lo = srgb / 12.92;
            vec3 hi = pow((srgb + a) / ap1, g);
            return mix(lo, hi, select);
        }
    

They don't have any measurable performance impact, as far as I can tell.

Edit: comparison between non-colourspace-corrected and colourspace-corrected
versions of the shadertoy link
[http://www.screenshotcomparison.com/comparison/134695](http://www.screenshotcomparison.com/comparison/134695)

~~~
panic
It's not pedantic -- interpolation in linear space makes a major difference!
It doesn't look like this shader is meant to show distance directly, though:
it's taking a fourth root and some other things:

    
    
        // do some soft coloring based on distance
        d = clamp( 8.0*d/zoo, 0.0, 1.0 );
        d = pow( d, 0.25 );
        vec3 col = vec3( d );

~~~
anon4
I'm calling it pedantic because, as far as I can see, nobody cares outside of
people doing movie VFX and the like, where it really is important that you
get it right. Your GUI, for example, doesn't care: elements are blended
assuming a linear colourspace, which means that all font rendering is just
slightly off. Even Apple don't care to make their blur effects correct; see
[https://www.youtube.com/watch?v=LKnqECcg6Gw](https://www.youtube.com/watch?v=LKnqECcg6Gw).
Even Photoshop doesn't do it correctly by default. Almost everybody is
perfectly happy treating the values as linear colour intensity.

The bottom line is that sRGB is hideously hard to work in. You really want to
use it only for storage, and only if you really must. It's an optimisation
that gives you better range on the low end, at the expense of the
high-intensity end, to match human colour vision. Storing linear values
instead, however, means using something like 48-bit RGB, and that's not a
price everyone wants to pay. People who do professional graphics work simply
use 32-bit floats per colour channel (128-bit RGBA, or even 192-bit RGBRaGaBa
with a separate alpha for each channel) and have workstations with 32 or
64GB of RAM. However, your normal everyday GUI application needs to run on a
phone with about 1GB of RAM. Or, going back further, it had to run on an Intel
486 with 16MB of RAM. That kind of explains where the culture of "just treat
it as linear, it's not that wrong" came from :)

------
melling
I noticed the plug for [https://handmadehero.org](https://handmadehero.org)

How's that going? Seems like progress has slowed.

~~~
RoboSeldon
Casey took a week off for a medical emergency in his family.

If you want to see how alive Handmade Hero is, just follow Casey on Twitter
and check the forums on handmadehero.org.

------
krylon
> I didn't use C99's complex number support because -- continuing to follow
> the approach Handmade Hero -- I intended to port this code directly into
> SIMD intrinsics.

Which makes me wonder - do GCC and/or Clang compile complex arithmetic to SIMD
instructions?

A couple of years back, I rewrote a Mandelbrot renderer from C89 to C99 and
used complex numbers, and I noticed that it ran faster afterwards (not
dramatically so, but I noticed the difference). I never checked, though, if
the compiler emitted SIMD instructions, and at some point since then I lost
the source code to a faulty hard disk and was too lazy to write another one.
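
For anyone who wants to check, a minimal sketch of the C99 version of the inner loop is below. The function name is made up for this example. It does not answer the question by itself; the idea is to compile it with something like `gcc -O2 -S` and read the generated assembly.

```c
#include <assert.h>
#include <complex.h>

/* Escape-time count using C99 complex arithmetic: iterate
 * z <- z*z + c until |z| > 2 or max_iter is reached. */
int escape_time_c99(double complex c, int max_iter)
{
    assert(max_iter > 0);
    double complex z = 0.0;
    for (int i = 0; i < max_iter; i++) {
        double re = creal(z), im = cimag(z);
        /* compare squared magnitude to avoid a square root */
        if (re * re + im * im > 4.0)
            return i;
        z = z * z + c;
    }
    return max_iter;
}
```

One thing to watch for in the output: as far as I know, GCC by default implements complex multiplication with extra Inf/NaN handling (a call to a helper such as `__muldc3`), which tends to get in the way of tight code. Options like `-ffast-math` or `-fcx-limited-range` relax that, so it is worth comparing the assembly with and without them.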

------
Narishma
I believe NEON was introduced with ARMv7, not ARMv6 as claimed in the article.
Though some Intel ARMv5 CPUs (when they were still making them) did support a
SIMD extension called WMMX, which was based on MMX and/or SSE.

