WebAssembly Raytracer (sniklaus.com)
215 points by omn1 on Dec 16, 2017 | 84 comments

The JS version is slow in Firefox because of a silly perf cliff, I filed a bug [0]. A quick fix for that improves the JS version from ~400 ms to ~180 ms for me locally. ~140 ms if I disable the Nightly-only GC poisoning.

edit: the JS code is a bit buggy: it does |if (Math.abs(dblDiscriminant) < 0.01)| in the |intersection| function, but dblDiscriminant is a var set in the next loop! Math.abs(undefined) is something we can easily optimize in the JIT, but really the JS code is broken. Folding this in our JIT [1] improves this to < 130 ms for me.
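For illustration, a minimal sketch of that pitfall (the surrounding function body is invented; only the variable name follows the comment above): a |var| declared later in a function is hoisted, so reading it before assignment yields undefined, and Math.abs(undefined) is NaN, which compares false against any number.

```javascript
// Hoisting pitfall: dblDiscriminant is declared with |var| further
// down, so it is hoisted and reads as undefined at the check.
function intersection() {
    // Math.abs(undefined) is NaN, and NaN < 0.01 is false,
    // so this branch is never taken.
    if (Math.abs(dblDiscriminant) < 0.01) {
        return 'near-zero';
    }
    for (var i = 0; i < 1; i += 1) {
        var dblDiscriminant = 1.0; // assignment only happens here
    }
    return 'fell through';
}

console.log(Number.isNaN(Math.abs(undefined))); // true
console.log(intersection()); // 'fell through'
```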

Please let us know if you run into JS perf problems, these bugs are usually easy to fix.

[0] https://bugzilla.mozilla.org/show_bug.cgi?id=1425687

[1] https://pastebin.com/CzfP5vQj

Thank you for your evaluation! It should have been dblDenominator instead of dblDiscriminant and I fixed it accordingly. However, this change does not seem to significantly affect performance.

I made a Path-tracer (photorealistic algorithm) in pure JS four years ago: http://renderer.ivank.net/

It rendered this image: http://renderer.ivank.net/balls.jpg :)

I also made the "full" 3D game: http://powerstones.ivank.net/ (uses a GLSL shader).

The 3D game doesn't work for me, when I hit play all I get is a white screen with light pink/teal/yellow coloured pixels/blocks flickering at the top of the screen. I'm on an intel integrated graphics.

Worked fine for me in Windows 10, FF57, using Intel integrated graphics. Things were noisy-looking, and I'm not sure if that's just part of the game aesthetic, or what.

That's how ray-traced images look until you give them long enough to smooth out.

That is beautiful, congrats! Advanced Global Illumination is a great book if one is interested in building one.

What's the performance like when compared to WebAssembly? Would that be difficult to determine?

There was no WASM when I made it in 2013, but I'd guess WASM would be around 10-15% faster.

Firefox really benefits from WebASM: the time per frame is ~90ms on my machine whereas the JS version is close to 400ms or so.

On the other hand, on Chrome the WebASM performance is ~100ms and JS is just a tiny bit slower at around 120ms!

That's a really interesting observation on so many levels!

Firefox (or rather SpiderMonkey) seems to really be focusing on WebASM performance, and it's easy to forget just how goddamn fast V8 has made JavaScript...

It's worth noting that this isn't idiomatic Javascript though, the author is jumping through hoops to avoid ever triggering the garbage collector.

Javascript can be very fast if you go down that route but the code gets extremely ugly.
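As a hypothetical sketch of that style (names invented, not from the raytracer): instead of returning a fresh object from every vector operation, hot-path code writes results into a caller-supplied scratch object, so the inner loop allocates nothing and the GC never triggers.

```javascript
// Idiomatic style: allocates a new object on every call,
// creating GC pressure inside a per-pixel loop.
function addIdiomatic(a, b) {
    return { x: a.x + b.x, y: a.y + b.y, z: a.z + b.z };
}

// Allocation-free style: the caller owns |out| and reuses it.
function addInto(out, a, b) {
    out.x = a.x + b.x;
    out.y = a.y + b.y;
    out.z = a.z + b.z;
    return out;
}

const scratch = { x: 0, y: 0, z: 0 }; // created once, reused every frame
addInto(scratch, { x: 1, y: 2, z: 3 }, { x: 4, y: 5, z: 6 });
console.log(scratch); // { x: 5, y: 7, z: 9 }
```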

Isn't most "fast path" code in a GCd language written like this?

I've jumped through similar hoops in Go, and even C++ to avoid allocation or GC.

The fact that a dynamic language like JavaScript can compile and execute as fast as it does is nothing short of incredible! Even if it's not idiomatic, it's still far from "unreadable".

Not if the language supports a mix of value types, off heap allocations or untraced references.

There are a few like that.

Can you name them?

Yes, surely.

- Eiffel

- Oberon family (Oberon, Oberon-2, Active Oberon, Component Pascal, Oberon-07)

- Mesa/Cedar

- Modula-3

- D

- Nim

- C# (especially now with the 7.2 improvements)

- Go

- Swift (RC is also a form of GC algorithm)

These are just the most well known ones, there are others if you feel like having a dive into SIGPLAN papers.

I can also provide examples, how to do it in any of them, if you wish.

What do you consider jumping through hoops? The code looked pretty normal to me.

I initially wrote the raytracer in a more idiomatic way. However, I wanted to show my students a side-by-side comparison, demonstrating that the program logic is pretty much the same. Take it as you will, I understand your critique though.

Idiomatic will probably entail a lot of objects for the pixel data, I'm assuming? I've seen some raytracers written with arrays of simple "bags" of numbers, which would increase performance.

I wrote a real-time 3D polygonal rasterizer in JS that supports texture mapping. After switching from regular arrays to Float32array to store coordinates, it got a huge (~50%) jump in performance.
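A toy sketch of that change (identifiers invented): an array of per-vertex objects versus a flat Float32Array holding the same coordinates. The typed array stores raw 32-bit floats contiguously, avoiding pointer-chasing and per-object overhead, which is presumably where the speedup came from.

```javascript
const NUM_VERTS = 3;

// Object-per-vertex layout: each vertex is a separate heap object.
const verts = [{ x: 0, y: 0 }, { x: 1, y: 0 }, { x: 0, y: 1 }];

// Flat typed-array layout: contiguous raw floats, x/y interleaved.
const flat = new Float32Array(NUM_VERTS * 2);
for (let i = 0; i < NUM_VERTS; i += 1) {
    flat[i * 2] = verts[i].x;
    flat[i * 2 + 1] = verts[i].y;
}

// Iterating the flat layout touches memory sequentially.
let sumX = 0;
for (let i = 0; i < NUM_VERTS; i += 1) {
    sumX += flat[i * 2];
}
console.log(sumX); // 1
```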

It really depends what you mean by "very fast". For most event driven UI applications JavaScript can certainly be more than fast enough, even on relatively underpowered devices (say, iPhone 5S).

With that being said, I recently ported a bunch of data heavy JS calculation code that had previously been running in the browser (and downloading about 100MB of data to the browser to do so - ouch) to run on a .NET back-end.

(Ignore the fact that doing this on the client will sound crazy to a professional software engineer but I would argue is totally legitimate in the context of a smart person who is not a software engineer building a prototype with tools they understand, which is what happened.)

Now, in doing the port I did improve and optimise the way the calculations were performed: basically I eliminated some redundant operations and structured the data up-front in a way that made it easier to work with. This in itself only sped things up by a factor of 2 or 3.

What really gave me the performance boost was taking advantage of the strong typing that languages such as C# offer. Just switching from IDictionary&lt;string, object&gt; (which I had used initially because I just wanted to get it working, and wasn't sure I'd always be working with the same type of data) to strongly typed dictionaries via generics, along with strongly typed arrays, gave me an extra 75 to 100 times performance boost.

In the end the C# version outperformed the JS version by a factor of around 200, mostly due to being able to take advantage of strong typing. And here I'm comparing the classic .NET 4.6 CLR to the latest version of Google Chrome, both running on the same machine.

I'm not JavaScript bashing - I use it all the time, and it's good for what it's good for - and V8 is a fantastic runtime, but due to the constraints of the language it simply does not approach the performance of strongly typed languages. I believe I would be able to do even better than the above with C++. Due to the parallelizable nature of the calculations we are considering a switch to CUDA to boost performance further.

If you are running performance-critical CPU-intensive code, unless you have to run that in the browser, JavaScript is generally not going to be the right choice. Based on the figures I see for this renderer, neither is WebAssembly.

If you do need to run that sort of code in the browser, and you want consistent performance across browsers, then WebAssembly might be[1].

[1] As long as you don't need to support IE11. If you do need to support it I'd suggest you find a way to avoid running large amounts of JS since, whilst its performance is definitely the best of any version of Internet Explorer, it's still not what you'd call fast and doesn't do a great job of memory management.

This really sounds strange. Can you put the code up somewhere? There shouldn't be a 75 times improvement by going to .NET. A 50% improvement is more likely.

That is such a picked-from-the-air number.

Nice comment! How does enforcing a type system help improve performance? Is there an article that sheds more light on this? Also, did you mean static typing instead of strong typing?

Static typing means that typechecking can be done at compile-time, and optimizations can be applied then. Dynamic typing generally means that some runtime layer needs to do typechecking during execution. So, as a rule of thumb, it's usually assumed that a statically-typed language will be faster than a dynamically-typed one.

... unless the dynamically typed language also allows type declarations, intended mainly for enhancing speed.

"Dynamic" or "static" isn't a binary value. A dynamic language that allows some static typing is less dynamic than a language that doesn't allow that.

Btw, similar numbers are available on the project's GitHub page: https://github.com/sniklaus/experiment-raytracer

Alternatively, it's pretty amazing that a nearly pure FPU task with small data that should live well in cache is only about 5x slower when written in straightforward JavaScript.

This is the kind of code that in previous generations would have expected to see a 20-100x penalty in perl or python.

As previous threads have already mentioned: that is not straightforward JavaScript. There's nothing impressive about stripping down a scripting language by making its structure exactly the same as C in the end.

What's the point of JavaScript in the first place? What we've done is take a microscope, make sure it's made of toughened glass and strengthened steel, and then use it to hammer in a few nails.

But would adding antialiasing kill the performance? It’s a small thing, but it’s kind of like a movie where you can totally suspend disbelief and one bad actor or effect jolts you out it.

Well, sure, but so would rendering at a more reasonable resolution, or a more realistic scene than five spheres on a plane. And it's not like 10fps is a useful frame rate in the first place. But it's the relative, not absolute, performance figures that are interesting here.

> And it's not like 10fps is a useful frame rate in the first place.

How times have changed. I remember running flight sims back in the 16-bit era (Amiga 500, for example) when 10fps would have been considered pretty good, especially if there was a lot going on.

But yes, I think nowadays you're right: 60fps or bust. And this is really why ray-tracing isn't used that much in games - it's just so computationally expensive.

Getting 60ms on Safari on macOS for WebASM.

The other three don't seem to work though.

The Shader version doesn't work for me in Safari 11.0.1 but the other three did work. I found that I couldn't run any of the others after I had tried the Shader version. I reloaded the page and then I could select between the JavaScript, asm.js, and WebAssembly versions without issue.

The Shader version needs WebGL 2.0, and works with the latest Safari Technology Preview

Perhaps it was written for Chrome specifically (?)

Firefox Quantum?

The first phase of Quantum made styling and layout faster than any browser. The next phase will make rendering faster than anything else. The JS engine is still the old one. (Maybe its time will come third?)

Mozilla people have been working on SpiderMonkey performance already: https://jandemooij.nl/blog/2017/12/06/some-spidermonkey-opti..., it's just not some wholesale replacement, so it probably never makes the top of the list for what they shout about. It also sounds like it has been a steady progression rather than a big jump, so maybe not quite as noticeable.

Yep, v57.

On my desktop:

Firefox: JavaScript ~372ms, asm ~121ms, WebAssembly ~91ms, Shader ~0.1ms

Chrome: JavaScript ~115ms, asm ~150ms, WebAssembly ~105ms, Shader ~0.1ms

Edge: JavaScript ~3340ms , asm ~(230-240)ms, WebAssembly ~(160-180)ms, Shader: Broken

IE11: Broken.

Interesting values.

Doesn't work for me at all in Edge (v41.16299.15.0). Are you on an Insider Preview?

Might depend on your device, the shader mode worked fine on my nvidia shield, but is broken on my nexus 5x.

I unfortunately had a bug in my code that I just fixed. It should work now, unless you encounter the bug that has been discussed here: https://github.com/mrdoob/three.js/issues/9716

Interesting that there is a 1000x speed difference (0.1 vs 100 ms, in Chrome) between the shader and the JS/WebASM versions even with just Intel integrated graphics.

Even after working with them for a while, GPUs still feel like cheating. They are just so goddamn good at what they do.

That kind of speed up isn't just a fluke here, it's pretty normal when your work aligns with what a GPU does best.

They're really fast even if you're sort of abusing them. Using fragment shaders to render entire scenes ("two triangle" style) probably wasn't what GPUs were built to do but the technique is awfully effective.

I mean: https://www.shadertoy.com/view/Mt2yzK

I've been working with GPUs for many years now and I still get surprised by their performance: some things turn out to be so bloody fast that I initially think I must be doing something wrong. Other times, CPU implementations run circles around the GPU for no obvious reason. It's not intuitive and I still have a hard time wrapping my head around how fast these things are when they are at their best.

It is a completely different way of working, but sometimes it just feels like going back to Amiga coding with Blitter and DMA.

It would be interesting to compare a C or assembly version against a shader version. My hunch is that this is just a use case where a general-purpose CPU is left in the dust by a modern GPU.

A shader is just taking advantage of the GPU's parallelism. It would still wreck a C++ implementation.

The emscripten.cpp is pretty much C?

Oh, you mean outside of the browser. Yes, that would be interesting.

Yep. That's what I meant.

The GPU version is much faster, but I doubt it's that much. I suspect the actual rendering time is not being measured because it happens asynchronously.

Still though, the potential for using GPUs to accelerate computations is high and I think too often overlooked by web developers, even those doing image processing.
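A rough sketch of the measurement pitfall being suggested (|gl| and both timing helpers are hypothetical): WebGL calls return as soon as commands are queued, so naively timing around a draw call measures submission, not rendering. Forcing a sync, e.g. gl.finish() or a pixel readback, is needed for the timing to include the actual GPU work.

```javascript
// Times only command submission: the draw call returns once the
// commands are queued, so this can report near-zero for heavy scenes.
function timeDrawSubmissionOnly(gl, draw) {
    const begin = performance.now();
    draw();
    return performance.now() - begin;
}

// Times submission plus GPU execution: gl.finish() blocks until the
// GPU has actually completed all queued work.
function timeDrawWithSync(gl, draw) {
    const begin = performance.now();
    draw();
    gl.finish();
    return performance.now() - begin;
}
```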

Is there a resource to learn WebAssembly?

I don't mean how to use a tool chain to compile higher level languages to WASM. I mean how to write the linear assembly bytecode as seen here on Wikipedia:


I went straight to the spec personally; it’s quite readable and shorter than I expected.

Where is it?

Not linear, but these are quite good for the Lisp-like syntax, which I prefer.



I think these won't let you write WASM. Am I wrong?

Sorry I did not read it properly, you can use these ones instead to actually write WASM.



You are right. I asked them about adding the option to compile WASM or WAST (instead of C), but the answer was that 'there are no use cases'. Of course, 'learning' would be just one.

The shader version is the fastest, but looks like modern art on my smartphone.

Like someone scribbled the colors over the circles instead of filling them.

This is probably the culprit: https://github.com/sniklaus/experiment-raytracer/blob/master...

It declares that floats should only be medium precision by default, but those precision specifiers are only followed on mobile platforms.

Desktop WebGL implementations just promote everything to high precision so that kind of bug is invisible until you test on mobile.
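A hypothetical fragment-shader preamble showing the qualifier in question (not the project's actual shader): desktop drivers promote everything to highp anyway, so a mediump declaration only reveals its rounding artifacts on mobile GPUs.

```javascript
// GL_FRAGMENT_PRECISION_HIGH is defined by GLSL ES when the fragment
// stage supports highp floats; fall back to mediump otherwise.
const fragmentShaderSource = [
    '#ifdef GL_FRAGMENT_PRECISION_HIGH',
    '    precision highp float;', // full precision for the ray math
    '#else',
    '    precision mediump float;', // mobile fallback; expect artifacts
    '#endif',
    'void main() {',
    '    gl_FragColor = vec4(1.0, 0.0, 0.0, 1.0);',
    '}'
].join('\n');

console.log(fragmentShaderSource.includes('highp')); // true
```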

Made a screenshot: https://imgur.com/a/7J4Da

Thank you for pointing it out, I fixed it by now. :)

Fixed for me. Nice work

That's probably an implementation issue that manifests on your hardware, called shadow acne / self-intersection. It's explained here[0]

0: https://www.scratchapixel.com/lessons/3d-basic-rendering/int...


When WASM only gives a 10% increase in performance, is it really worth it, while at the same time using "hardware" shaders gives a 1000x increase in performance? If the goal of WASM is not performance, what is the reasoning behind it? That you would be able to compile any(!?) program to run in the browser?

This might not be the best way to measure wasm performance. WebAssembly can be 2-5x faster than JavaScript.

Not to forget, wasm is at a very early stage; once we get threading support (https://github.com/WebAssembly/threads) we can get much better results.

Wow, Chrome is able to run the Shader on my Intel integrated graphics on Wayland in Ubuntu 17.10

Colo(u)r me impressed.

I'm getting 0.1 for the Shader versus 125 for WebAsm, with ASM.js/JS at ~210 – kind of crazy that Chrome can run plain JS as fast as ASM.js (or is it?)

Firefox (57) can't run Shader unless there's some under the hood option I have to turn on …

I'm using Firefox 58 beta on macOS with Intel graphics and the Shader version works for me. Maybe WebGL is disabled in your set up. Some drivers and driver versions are blocklisted to avoid instability:


The graphics section in about:support might tell you your WebGL status.

A little beside the point, but I love the real-time distribution visualisation with the boxplot.

Thank you for your kind words. I borrowed the idea from a previous project in which curious students could evaluate the computational performance of their various implementations using a similar plot: https://github.com/sniklaus/teaching-minichess
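For the curious, a small sketch (invented code, not from the project) of the five-number summary such a boxplot visualises, computed from per-frame timings:

```javascript
// Derive min, Q1, median, Q3, and max from a list of frame times (ms).
function fiveNumberSummary(samples) {
    const sorted = samples.slice().sort((a, b) => a - b);
    const quantile = (q) => {
        const pos = (sorted.length - 1) * q;
        const lo = Math.floor(pos);
        const hi = Math.ceil(pos);
        // Linear interpolation between the two nearest ranks.
        return sorted[lo] + (sorted[hi] - sorted[lo]) * (pos - lo);
    };
    return {
        min: sorted[0],
        q1: quantile(0.25),
        median: quantile(0.5),
        q3: quantile(0.75),
        max: sorted[sorted.length - 1]
    };
}

console.log(fiveNumberSummary([90, 100, 110, 120, 130]));
// → { min: 90, q1: 100, median: 110, q3: 120, max: 130 }
```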

a raytracer in JS: ~407.3 ms and in WASM: ~87.4 ms and in GLSL: ~0.17 ms


This is where I answered most of the questions. I didn't even see that someone posted it here as well. Thanks!

I had no idea ray tracers could be written with such concise and clean code. The results are amazing too. Lesson learnt: JavaScript is extremely fast on Chrome, six times slower than WASM in Firefox, and GPU shaders are still faster by a factor of 1000!!

If you give up having clean code, you can even fit a raytracer on a business card: http://fabiensanglard.net/rayTracing_back_of_business_card/


JavaScript 106.8

asm.js 140.0

WebAssembly 88.4

Shader 0.1

Just a Dell Inspiron 13 laptop with the latest Chrome 63.

Seeing the same with Chromium on my Linux machine. In Firefox the JS version is really slow, as you would expect.

It's kind of hard to judge this without having a decent enough look at the code, or how it is being benchmarked. For example, asm.js requires quite a bit of set-up to initialise; if it's restarted from scratch every loop for benchmarking purposes, that would mess up the timing.
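A minimal sketch (invented, not the actual benchmark code) of how to keep one-time set-up out of the timed region: warm up first, then average only steady-state frames.

```javascript
// Run |renderFrame| a few times untimed (JIT warm-up, module
// initialisation), then time only the steady-state iterations.
function benchmark(renderFrame, warmupFrames, timedFrames) {
    for (let i = 0; i < warmupFrames; i += 1) {
        renderFrame(); // excluded from the measurement
    }
    const begin = performance.now();
    for (let i = 0; i < timedFrames; i += 1) {
        renderFrame();
    }
    return (performance.now() - begin) / timedFrames; // mean ms per frame
}
```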

Shader 0.2 - Samsung galaxy s8
