
I implemented fast parallel reduction on the GPU with WebGL - erkaman
https://mikolalysenko.github.io/regl/www/gallery/reduction.js.html
======
erkaman
Hi guys. For the purpose of demonstrating the GPGPU capabilities of the WebGL
framework regl (https://github.com/mikolalysenko/regl), I implemented this
parallel reduction algorithm on the GPU. Even when I run it on my silly
integrated graphics card, the GPU version is about four times faster than the
CPU one.

Yet the implementation is just a simple full-screen shader that is run for a
couple of passes. You can see the source code here:
https://github.com/mikolalysenko/regl/blob/gh-pages/example/reduction.js
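
Roughly, one pass can look like this in regl. This is a sketch under my own
naming, not the code from the linked example: I'm assuming N is a power of
two, 'inputTex' is an N x N texture holding the data (real data would need a
float texture via the OES_texture_float extension), and 'fbos[i]' is a
framebuffer of side N / 2^(i+1), so the last one is 1 x 1.

    const regl = require('regl')()

    // One reduction pass: draw a single full-screen triangle into a
    // framebuffer half the size of the input; each output texel folds
    // a 2x2 block of input texels with the operator.
    const reducePass = regl({
      frag: `
      precision mediump float;
      uniform sampler2D src;   // (2N x 2N) input of this pass
      uniform float outSize;   // N, side length of the output
      varying vec2 uv;
      void main () {
        // a quarter of an output texel = half an input texel, so these
        // four taps hit the centers of the corresponding 2x2 input block
        float h = 0.25 / outSize;
        float a = texture2D(src, uv + vec2(-h, -h)).x;
        float b = texture2D(src, uv + vec2( h, -h)).x;
        float c = texture2D(src, uv + vec2(-h,  h)).x;
        float d = texture2D(src, uv + vec2( h,  h)).x;
        gl_FragColor = vec4(a + b + c + d);  // op = '+'
      }`,
      vert: `
      precision mediump float;
      attribute vec2 position;
      varying vec2 uv;
      void main () {
        uv = 0.5 * (position + 1.0);
        gl_Position = vec4(position, 0, 1);
      }`,
      attributes: { position: [-4, -4, 4, -4, 0, 4] },  // one oversized triangle covering the screen
      count: 3,
      uniforms: {
        src: regl.prop('src'),
        outSize: regl.prop('outSize')
      },
      framebuffer: regl.prop('dst')
    })

    // Halve the size every pass until a single texel holds the result.
    let src = inputTex
    for (let i = 0, size = N / 2; size >= 1; i++, size /= 2) {
      reducePass({ src: src, outSize: size, dst: fbos[i] })
      src = fbos[i].color[0]  // this pass's output is the next pass's input
    }
    // src is now a 1x1 texture whose only texel is the reduced value

Each pass evaluates one level of the reduction tree, so an N x N input takes
log2(N) passes in total.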

In case you are not familiar with parallel reduction, I will explain it here:
given some elements x0, x1, x2,..., and a binary operator 'op', the reduction
is 'op(x0, op(x1, op(x2,...)))'. For example, given the elements 4, 2, 4, 1
and the operator '+', the reduction is 11, which is just the sum of the
elements.
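
Sequentially, that is just a fold. A throwaway JavaScript illustration (mine,
not code from the example):

    // sequential reduction: fold the list with the binary operator
    const reduce = (xs, op) => xs.reduce((acc, x) => op(acc, x))

    const add = (a, b) => a + b
    console.log(reduce([4, 2, 4, 1], add))  // 11

The parallel version works because '+' is associative: the elements can be
combined pairwise in a tree, and the GPU evaluates one level of that tree per
pass instead of walking the list one element at a time.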

So parallel reduction can, for instance, be used to compute the maximum, sum,
or minimum of a list of elements. It is a very important building block for
many parallel algorithms. These kinds of parallel algorithms on the GPU are
part of what makes Google's TensorFlow library so fast.
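
Swapping the operator in the little 'reduce' above is all it takes:

    console.log(reduce([4, 2, 4, 1], Math.max))  // 4, the maximum
    console.log(reduce([4, 2, 4, 1], Math.min))  // 1, the minimum

On the GPU side this corresponds to replacing the '+' in the fragment shader
with max() or min().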

