Deciphering the Business Card Raytracer

a_e_k · on Sept 22, 2013

Hi there, code author here. I've been lurking here for years, but couldn't resist an invitation like this.

I've always enjoyed Fabien's analyses so it's great fun to see what he makes of my own code. Anyway, a few clarifications from what I remember of this. (I wrote this '09, and I don't have my old notes in front of me right now):

* The `n` value returned by the trace function, `T()`, is the surface normal. It returns this whether it hit the plane or a sphere.

* The `r` vector in `S()` is the reflection vector off whatever was hit.

* The mystery `c` point in the `main()` function is the offset from the eye point (ignoring the lens perturbation `t`) to the corner of the focal plane. Note below that we're nominally tracing rays for pixel (x,y) in the direction of ax+by+c. At the image midpoint, this should be `g`.

* "although I suspect this to be a side effect: This is not how soft-shadows are done." True. There are true soft shadows in this, but this isn't where they're computed. That's the job of the randomization where `l` is computed in `S()`.

Anyway, please feel free to ask any questions about it.

fabien_sanglard · on Sept 22, 2013

Thanks Andrew, that 'c' variable was giving me a headache ;) ! I have updated the article with your remarks.

BTW, I have listed a few good resources about raytracing at the bottom of the page: Do you know any other good books/websites ?

a_e_k · on Sept 22, 2013

You're welcome!

Scratchapixel was new to me, so thanks for that link. PBRT (which you already have) is definitely a favorite. You also cited Graphics Gems IV but I enjoy dipping into all of them. Other suggestions:

* http://tog.acm.org/resources/RTNews/html/ (Somewhat dead, but the archives are great)

* http://ompf2.com/ (Active, lots of ray tracing specific discussion)

* http://www.realtimerendering.com/ (I'm fond of the book as well)

* http://kesen.realtimerendering.com/ (Especially the sections for "Symposium on Interactive Ray Tracing" and it's successor, "High Performance Graphics")

If I had to pick one, though, it'd be PBRT since it's a unified work that starts from the ground up.

antirez · on Sept 22, 2013

Some time ago I wrote a raytracing program in javascript that is supposed to be very simple to understand: https://github.com/antirez/jsrt/blob/master/rt.html

Demo here: http://antirez.com/misc/rt.html

Uchikoma · on Sept 23, 2013

Nice!

We did a real time raytracer with 2 spheres where you could move around one with C and a cluster of DEC Alphas 20 years ago, funny to see this on my laptop in Javascript (!) now.

fabien_sanglard · on Sept 22, 2013

How come you decided for a coordinate system with Z pointing up instead of the traditional right hand/left hands coordinate systems ?

a_e_k · on Sept 22, 2013

Actually, Z-up coordinate systems can also be called right or left handed depending on which way Y points. :-)

But really, it was just personal preference. Way back when I was learning 3D graphics, I found it easiest to visualize Z as elevation above the XY plane. I'd also used some software that followed that convention (like trueSpace and later 3D Studio Max).

These days, I mainly work with software that tends to favor a Y-up convention by default.

mistercow · on Sept 22, 2013

In non-programming math contexts, it's pretty standard for Z to point up. I have no idea why, as it seems to me like the typical way it works in graphics programming is the more obvious extension of 2D plotting into 3D.

Sharlin · on Sept 22, 2013

As far as I know, in the CAD world Z is typically up/down. It makes sense if you look at the scene from above, like an architect a floor plan - world-space X and Y map to screen-space X and Y, and Z is then the depth. In a 3D virtual world, on the other hand, the perspective is usually of a person embedded in the world, in which case "Y is up" is more intuitive.

mistercow · on Sept 22, 2013

OK, thinking about the architecture angle finally made it click for me.

LeafStorm · on Sept 22, 2013

When I took Calculus III, Z was up. When I took Engineering Physics the same semester, Y was up.

According to legend, this is why the Math and Physics departments hate each other.

bulte-rs · on Sept 23, 2013

So the difference between Math and Physics can be described as a single rotation matrix?

carbonik · on Sept 27, 2013

I thought it was coming from the fact Nobel's wife had an affair with a mathematician. Thus explaining why there is no Nobel Price for Mathematics.

dagw · on Sept 27, 2013

There are several reasons to believe that that urban legend isn't true. One of them being that Nobel was never married.

mynameisfiber · on Sept 22, 2013

With the right rotation, any direction can be pointing "up".

Luc · on Sept 22, 2013

It makes sense if you think of the X and Y axis as being drawn on paper, which usually lies flat on the desk.

mistercow · on Sept 22, 2013

Sort of, kind of, but it's a stretch. "Up and down" on a page are not oriented according to gravity, but according to the top and bottom of the page. If you say "Well, we'll make the z-axis vertical because vertical is perpendicular to the plane of the desk, and that's depth", and then you put the z-axis toward the top of the page, you're using two completely different definitions of "vertical" in the same sentence.

qu4z-2 · on Sept 23, 2013

I always use Z for up/down even though I'm a programmer. For some reason I just find it easier to reason about. In fairness, my first forays into 3D were extending a 2D top-down game with an "altitude".

pux0r3 · on Sept 24, 2013

I'm a game programmer, and I have an answer for my preference of Z being up: I want positive values going into the screen (away from the viewer) but I also want a right handed coordinate system (mostly because physics become more intuitive - especially wrt torque).

X is almost always to the right, so this gives me Y and Z to work with. To satisfy both the positive values into the screen and right handedness I need either Z forward and Y down or Y forward and Z up. Since most of the time you draw a graph with an axis going up rather than down, that settles it.

It's entirely arbitrary in the end, a single matrix will fix any dispute you have amongst peers.

raminf · on Sept 22, 2013

#define op operator

#define rt return

Then replacing the corresponding inline tokens shaves off another 25 bytes from the source without affecting the output or (arguably) the readability.

Sorry, my inner obfuscator couldn't resist :-)

Beautiful bit of coding, btw.

a_e_k · on Sept 22, 2013

Thanks, glad you liked it!

Yes, it's true that even more could have been shaved off and I did explore some #defines. As I recall, there were also some other changes I could have made to shave off yet more. Ultimately, I decided against them for two main reasons. First, they'd need to occupy lines by themselves which I felt created a bigger gap at the top that ruined the aesthetics somewhat. I wanted something that would look nice and roughly fill the 1.75 aspect ratio of a typical US business card. Secondly, as noted in my reply to matthew-wegner below, I'd noticed the file size that I had achieved.

zem · on Sept 22, 2013

#defines have to be one to a line, though - in this case, where you're optimising for layout rather than bytes, i'm not sure it would be a net win. counting, there are 5 instances of "operator", so that's a saving of 36 chars versus 40 taken up by the #define (including the whitespace at the end). there are 10 "return"s, so saving 40 chars there just about draws you even.

phySi0 · on Sept 22, 2013

>Then replacing the corresponding inline tokens shaves off another 25 bytes from the source

I dunno, I kinda like the current size.

fabien_sanglard · on Sept 22, 2013

Can you pastebin.com it so we can see what it looks like ?

raminf · on Sept 22, 2013

http://pastebin.com/ktukPBHE

I reformatted it so it would justify right like the original but for some reason Pastebin likes to reflow things.

It turns out the single-character tokens 'u' and 'w' were not being used so repurposing them for the #defines leads to a saving of 46 characters at the cost of an extra line. There are a few other recurring tokens but you hit a point of diminishing return.

One should not, however, lose sight of what an awesomely clever hack the whole thing is. Reminds me of some of the text-flow layouts in old illuminated manuscripts, or the more modern variations here (like the one with the light-bulb): http://www.smashingmagazine.com/2008/02/11/award-winning-new...

vldr · on Sept 25, 2013

As a (poor :) 3d/raytrace excercise, I've translated the code to python: http://vanderwijk.info/blog/business-card-raytracer-in-pytho...

chrissnell · on Sept 22, 2013

If you want to make your own, like Fabian Sanglard did, take this block of binary numbers and draw your initials (left-justified) with 1's:

  0001110000010001110
  0010000000000010000
  0100000000010010000
  0100000000010001100
  0100000000010000100
  0010000010010000010
  0001110001100011100

That's my initials, cjs. If you unfocus your eyes, you can see the letters pretty clearly.

Take each line, top to bottom, and convert to decimal:

http://www.mathsisfun.com/binary-decimal-hexadecimal-convert...

Edit Paul Heckbert's code (lines 12-13, the "G" array), replacing the numbers from right to left with the decimal values. In other words, the last number in the array is the top line of the binary block above.

Clean up the justification and you have your business card raytracer, courtesy of Mr. Heckbert.

I probably could have coded this in the time that it took me to draw my initials in binary pixels.

Also, here's a version with the camera moved slightly farther out so that the letters don't clip:

https://gist.github.com/chrissnell/6656963

logicallee · on Sept 22, 2013

thanks.

what exactly were you referring to with "I probably could have coded this in the time that it took me to draw my initials in binary pixels."?

chrissnell · on Sept 22, 2013

I could have coded something to generate customized versions of the original source, given three ASCII initials.

logicallee · on Sept 22, 2013

Come on, Chris. I'm a big fan of the humblebrag, but you would still have to draw the binary pixel version of each letter (because C doesn't come with binary pixel array versions of the 26 letters) - so it would be 8 times as much work at an absolute minimum (26 characters versus the 3 that you did), even if the code to embed them in the array were a completely obvious 1-liner, and so was the code to clean up the justification.

Let's not flatter ourselves :)

chrissnell · on Sept 22, 2013

Didn't mean to come off as arrogant but there are existing tools that would make this a lot easier. figlet(1) would make the pixel conversion pretty easy. Pixels to binary really isn't that hard. You could do it with a regex.

derekp7 · on Sept 22, 2013

Another way is to use the netpbm toolkit. First, use its ppmdraw command to rasterize each character to a canvas, then convert that canvas to xpm format (you have a separate canvas / picture for each character). The xpm format has the nice property that it is also a valid C structure. Real easy to then convert to whatever you want.

Oh, and the netpbm toolkit has not only stand alone binaries, but it also has a C library for each of the conversions that it includes. Seems to be an older under-rated toolkit.

skrebbel · on Sept 22, 2013

Render a Small Fonts text to a text-based monochrome image format with e.g. Imagemagick, read file, and parse each line as binary. Reverse array and separate by commas.

jevinskie · on Sept 22, 2013

Would the wasted empty space in a #define return <pick a single character> make up for the several return statements that I see? That is why typedef statements are nice, you can ; terminate them!

matthew-wegner · on Sept 22, 2013

The original is also 1337 bytes, which might explain any padding!

a_e_k · on Sept 22, 2013

Guilty as charged!

I stopped trying to minimize it when I noticed that.

Rangi42 · on Sept 22, 2013

The original ended on line 35, column 28. By #defining return as u, it ends at line 35, column 19. Although I had to split the number 231184 across two lines (2311\84) to avoid a gap at the end of the line.

http://pastebin.com/6rM895SC

chii · on Sept 23, 2013

It looks like his site is taken down due to bandwidth limit reached. Use http://webcache.googleusercontent.com/search?q=cache:http://... till it's back up.

xioxox · on Sept 22, 2013

I like this extremely short postscript raytracer I saw quite a few years ago:

http://www.is.titech.ac.jp/~sadayosi/lab/h-takasi/ops.html

  %!IOPSC-1993 %%Creator: HAYAKAWA Takashi <h-takasi@is.titech.ac.jp>
  /C/neg/d/mul/R/rlineto/E/exp/H{{cvx def}repeat}def/T/dup/g/gt/r/roll/J/ifelse 8
  H/A/copy(z&v4QX&93r9AxYQOZomQalxS2w!!O&vMYa43d6r93rMYvx2dca!D&cjSnjSnjjS3o!v&6A
  X&55SAxM1CD7AjYxTTd62rmxCnTdSST0g&12wECST!&!J0g&D1!&xM0!J0g!l&544dC2Ac96ra!m&3A
  F&&vGoGSnCT0g&wDmlvGoS8wpn6wpS2wTCpS1Sd7ov7Uk7o4Qkdw!&Mvlx1S7oZES3w!J!J!Q&7185d
  Z&lx1CS9d9nE4!k&X&MY7!&1!J!x&jdnjdS3odS!N&mmx1C2wEc!G&150Nx4!n&2o!j&43r!U&0777d
  ]&2AY2A776ddT4oS3oSnMVC00VV0RRR45E42063rNz&v7UX&UOzF!F!J![&44ETCnVn!a&1CDN!Y&0M
  V1c&j2AYdjmMdjjd!o&1r!M){( )T 0 4 3 r put T(/)g{T(9)g{cvn}{cvi}J}{($)g{[}{]}J}J
  cvx}forall/moveto/p/floor/w/div/S/add 29 H[{[{]setgray fill}for Y}for showpage

szc · on Sept 22, 2013

There are two other small ray tracers that can be examined. Both are winners of the International Obfuscated C Code Contest.

Matt Zucker won in 2011 and Anders Gavare won in 2004.

They can be found here, http://www.ioccc.org/years-spoiler.html

(The 22nd IOCCC is open for entries until 2013-Oct-03)

dinkumthinkum · on Sept 22, 2013

Very nice article and straight to the point. It's really nice to see a pithy discussion and impressive demonstration.

ggambetta · on Sept 22, 2013

I love this kind of thing. I've made a similar one (minus texturing) in Javascript, drawing to a Canvas; my current version is 975 bytes. Output, full source and live demo here: http://gabrielgambetta.com/tiny_raytracer.html

milesf · on Sept 22, 2013

The output file is .aek but I can't figure out what to do with it. My Google-fu is failing me this evening.

How do you view the output?

Pitarou · on Sept 22, 2013

If you're using Linux or OSX, run it like this:

  ./aek > aek.ppm

That will produce an image in .ppm format. ('The world's second-dumbest picture format.') You'll probably have to convert it to a more widely supported format. Make sure you have the netpbm package installed, and then run, say:

  pbmtopng aek.ppm > aek.bmp

Sharlin · on Sept 22, 2013

If you have ImageMagick or GraphicsMagick installed, you can just pipe to display(1):

    .aek | display

tnorthcutt · on Sept 22, 2013

The correct command is actually pnmtopng, FWIW.

Pitarou · on Sept 23, 2013

Oops!

ville · on Sept 22, 2013

What is the worlds dumbest picture format then?

Pitarou · on Sept 23, 2013

pbm

    P1
    # feep.pbm
    24 7
    0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    0 1 1 1 1 0 0 1 1 1 1 0 0 1 1 1 1 0 0 1 1 1 1 0
    0 1 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 1 0 0 1 0
    0 1 1 1 0 0 0 1 1 1 0 0 0 1 1 1 0 0 0 1 1 1 1 0
    0 1 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0
    0 1 0 0 0 0 0 1 1 1 1 0 0 1 1 1 1 0 0 1 0 0 0 0
    0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

Sharlin · on Sept 22, 2013

Raw data without any header, I suppose. You must know or guess the width and height to properly decode the image.

evincarofautumn · on Sept 22, 2013

Guessing the dimensions of an image with n pixels is equivalent to factoring n, so good luck with that! And only if your luck is really good (prime width and height) will there be a unique solution—actually you still don’t know whether to choose portrait or landscape. It gets even more interesting if you don’t know the number of channels or the pixel format.

Sharlin · on Sept 22, 2013

Well, you can check if the image matches common resolutions and aspect ratios - if there are 307200 pixels, for example, a pretty good guess is that it's 640x480 (or 480x640). You'd need a human to see if the image looks correct - actually, with a simple GUI, it wouldn't take a human very long at all to figure out the correct aspect ratio. This could probably be automated easily enough using some pattern recognition - the "shearing" pattern caused by guessing the width wrong is pretty linear and well-behaving.

thibauts · on Sept 23, 2013

Minimize the gradient ...

logicallee · on Sept 22, 2013

No, that's not it at all.

What you just mentioned is actually really easy (if it worked): factoring n where n is up to 10 megapixels.

Factoring up to a few megapixels (e.g. 3648 x 2736) isn't hard at all! At most there are 3159 guesses (square root of number of pixels) before you hit one of the factors - and that's starting at 1, rather than starting at sane minimum aspect ratio.

(It's only when you start trying to factor really large numbers that the problem becomes intractable. For example, the same trick just doesn't apply to a composite number that's 600 digits long. We're talking 6-9 digits, which a tight loop can crunch through faster than a disk read.)

The harder issue is after you do factor it into primes, you still have to combine those primes into a width and height.

For example, 640 x 480 is 307200. Factor that and you get 2^12 * 3 * 5^2.

There's obviously quite a few ways to combine those into two groups (two factors)...

so you're really not done. Probably the best heuristic is to simply count pixels and then compare with common formats.

I suppose you could do your factorization trick, and then try combining the factors in every possible way into two groups, and for any combinations that result in a "sane" aspect ratio, run some kind of vision algorithm to see if there are connected color blobs or something.

it does seem to me that for truly arbitrary pixel counts, the factors help but don't determine a unique choice.

gilgoomesh · on Sept 22, 2013

The output is PPM, not AEK (that's Andrew Kensler's initials). Use PPM as the extension (similar to PGM, PNM and others http://en.wikipedia.org/wiki/Netpbm_format). It's not very common in consumer products (too primitive) but you can open in Gimp, Preview.app, IrfanView, XV and lots of others.

tnorthcutt · on Sept 22, 2013

FWIW I was not able to open it with Preview on OS X 10.8

gilgoomesh · on Sept 22, 2013

Eep, my mistake. You're right: it doesn't work. I thought I used to open PPMs with Preview back in Mac OS 10.3 or something else ancient. I could be mistaken on that.

Oh well. Everyone has Gimp and LibreOffice installed though, right? They both open the file without problems.

lilyball · on Sept 22, 2013

Graphic Converter can open PPMs, and that used to ship with every Mac. That's probably what you used.

ubernostrum · on Sept 22, 2013

Or just install ImageMagick, and do 'convert card.ppm card.png'.

Sharlin · on Sept 22, 2013

If you have IM, just use 'display' to open it directly.

milesf · on Sept 22, 2013

Ah. LibreOffice's Drawing program opened it. Thanks :)

lesterbuck · on Sept 22, 2013

Yes, same problem. Google suggests it is an Adobe After Effects file. Linux "file" reports it as

card.aek: Netpbm PPM "rawbits" image data, size = 512 x 512

Not sure how to view Netpbm (portable bit map).

carterschonwald · on Sept 22, 2013

if you read further, you'll see its supposed to be .ppm, but the article author made a typo :)

thenomad · on Sept 22, 2013

Along the same lines, albeit much less compact - a 99-line Path Tracer (the next evolution from a straight-up raytracer, performing global illumination calculations):

http://www.kevinbeason.com/smallpt/

kid0m4n · on Sept 23, 2013

The same thing, implemented in Go:

https://github.com/kid0m4n/gorays

eaxitect · on Sept 22, 2013

This is really single piece of sh*t I like most for a while... simple and elegant