
The lost art of 3D rendering without shaders - mmphosis
http://machinethink.net/blog/3d-rendering-without-shaders/
======
antirez
The year I learned to write C code I was 19, in my second year of university
and already willing to drop out, so I started spending time with C and 3D
graphics. I was fresh off the math exam, so the 3D matrix transformations to
do rotations were trivial to implement. I just wrote a function to draw
triangles, used a simple z-sorting technique, and did the basic shading by
calculating the cosine of the angle between the observer and the surface.
With just these basic things I ended up with 3D "worlds" similar to the ones
I saw in DOS games when I was a child. All the effort was maybe 500 or 1000
lines of code, but building things from scratch, starting only from the
ability to draw an RGB pixel, gave me a sense of accomplishment that later
shaped everything else I did. I basically continued for the next 20 years to
create things from scratch.
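
To make that last technique concrete: the cosine of the angle between two
unit vectors is just their dot product, so that style of flat shading reduces
to a few multiplies per triangle. A minimal C sketch (not antirez's actual
code; the vector helpers are made up for illustration):

    #include <math.h>

    typedef struct { float x, y, z; } Vec3;

    static float dot(Vec3 a, Vec3 b) { return a.x*b.x + a.y*b.y + a.z*b.z; }

    static Vec3 normalize(Vec3 v) {
        float len = sqrtf(dot(v, v));
        Vec3 r = { v.x / len, v.y / len, v.z / len };
        return r;
    }

    /* Flat shading: brightness is the cosine of the angle between the
       surface normal and the direction toward the observer, clamped to
       zero for surfaces facing away. */
    static float shade(Vec3 normal, Vec3 to_observer) {
        float c = dot(normalize(normal), normalize(to_observer));
        return c > 0.0f ? c : 0.0f;
    }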

~~~
stevoski
My first year university Linear Algebra textbook even had an appendix
explaining how to rotate and skew 3D objects in computer graphics using the
matrix multiplication I had learned that semester. I loved it.

Then I finished university and got a programming job creating forms to gather
user data, put it into a database, and generate reports. Sigh.

~~~
antirez
I have the feeling that, unfortunately, most programming jobs, even in shiny
startups, are more like data forms than 3D engines... That's why many
programmers have OSS side projects where they do cool things.

------
pcwalton
This is a good tutorial, but it's important to note that scanline rasterizers
are _not_ how GPUs (or even high-performance SIMD software implementations)
work. Instead, they use barycentric coordinate sign tests for better
parallelism and "free" interpolation.

A good explanation of this is Fabian Giesen's:
[https://fgiesen.wordpress.com/2013/02/06/the-barycentric-conspirac/](https://fgiesen.wordpress.com/2013/02/06/the-barycentric-conspirac/)
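
For the curious, a minimal sketch of that approach, under my own assumptions
(`put_pixel` is a hypothetical framebuffer write): every pixel in the
bounding box is tested independently, which is what makes it parallel-
friendly, and the three edge values double as unnormalized barycentric
weights for interpolation.

    /* Edge function: twice the signed area of triangle (a, b, p).
       Its sign says which side of edge a->b the point p is on. */
    static int edge(int ax, int ay, int bx, int by, int px, int py) {
        return (bx - ax) * (py - ay) - (by - ay) * (px - ax);
    }

    static int min3(int a, int b, int c) { int m = a < b ? a : b; return m < c ? m : c; }
    static int max3(int a, int b, int c) { int m = a > b ? a : b; return m > c ? m : c; }

    void put_pixel(int x, int y);  /* hypothetical framebuffer write */

    /* Rasterize a counter-clockwise triangle by testing every pixel in
       its bounding box. Each pixel is independent: easy to vectorize or
       parallelize, unlike a sequential scanline walk. */
    void raster_triangle(int x0, int y0, int x1, int y1, int x2, int y2)
    {
        int minx = min3(x0, x1, x2), maxx = max3(x0, x1, x2);
        int miny = min3(y0, y1, y2), maxy = max3(y0, y1, y2);

        for (int y = miny; y <= maxy; y++) {
            for (int x = minx; x <= maxx; x++) {
                int w0 = edge(x1, y1, x2, y2, x, y);
                int w1 = edge(x2, y2, x0, y0, x, y);
                int w2 = edge(x0, y0, x1, y1, x, y);
                /* Inside when all three edge tests agree; w0..w2 are
                   also unnormalized barycentric weights, so attribute
                   interpolation comes "for free". */
                if (w0 >= 0 && w1 >= 0 && w2 >= 0)
                    put_pixel(x, y);
            }
        }
    }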

~~~
exDM69
You're absolutely right about scanline rasterizers and GPUs, but scanline
rasterization is very interesting historically. Most of the 1990s pre-GPU
software renderers did scanline rasterization; Quake and Thief, for example,
both had pure software renderers.

Before GPUs, perspective-correct texture mapping was the holy grail of 3D
graphics, because CPUs of the time were not fast enough to do all the
divisions for each pixel, and lots of clever tricks were invented to work
around this limitation. It's a bit of a shame that this article does not
cover it; perhaps there's a part 2 in progress.

Here's a quick write up about Thief's engine
[https://nothings.org/gamedev/thief_rendering.html](https://nothings.org/gamedev/thief_rendering.html)

I've seen a similar piece on Quake but I can't find it now.
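
One of the classic tricks: u/z, v/z and 1/z _are_ linear in screen space, so
you can do the true perspective divide only every 16 pixels and interpolate
linearly in between, which is roughly what Quake did. A rough sketch under
my own assumptions (`sample_texture` is a hypothetical texel fetch, and a
real version of the era would have used fixed point):

    void sample_texture(float u, float v);  /* hypothetical texel fetch */

    /* Walk a horizontal span. uz, vz, iz are u/z, v/z, 1/z at the left
       end; duz, dvz, diz are their per-pixel steps (all linear in
       screen space). Only one pair of divides per 16 pixels. */
    void textured_span(float uz, float vz, float iz,
                       float duz, float dvz, float diz, int count)
    {
        float u0 = uz / iz, v0 = vz / iz;      /* exact left endpoint */
        while (count > 0) {
            int run = count < 16 ? count : 16;
            uz += duz * run; vz += dvz * run; iz += diz * run;
            float u1 = uz / iz, v1 = vz / iz;  /* exact right endpoint */
            float du = (u1 - u0) / run, dv = (v1 - v0) / run;
            for (int i = 0; i < run; i++) {    /* cheap affine inner loop */
                sample_texture(u0, v0);
                u0 += du; v0 += dv;
            }
            u0 = u1; v0 = v1;
            count -= run;
        }
    }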

------
ChuckMcM
Takes me back. A long time ago I wrote a simple rendering library on top of
the 3dfx "Glide" API. It didn't do shaders, but it would do mipmapped texture
rendering, which allowed you to have an image (texture) on your triangle. For
a while I was stuck on the projection matrix and understanding screen
clipping, until my Dad gave me his copy of the Kodak Reference Handbook[1],
third edition, copyright 1945. It describes focal length, field of view,
f-stops, and lens effects very clearly.

[1]
[https://books.google.com/books?id=6DgYAQAAMAAJ&dq=Kodak%20Re...](https://books.google.com/books?id=6DgYAQAAMAAJ&dq=Kodak%20Reference%20Handbook)

~~~
planteen
Looks like archive.org has a copy:
[https://archive.org/details/KodakReferenceHandbook](https://archive.org/details/KodakReferenceHandbook)

~~~
ChuckMcM
Awesome, that is the one I have.

------
alkonaut
Nitpick: this is software rendering. This is how we did it before any kind of
3D API existed. GL, D3D, etc. were all without _shaders_ to begin with. I
_still_ maintain a fixed-pipeline (no explicit vertex or fragment shaders) 3D
app with DirectX.

One can argue that the fixed pipeline of D3D uses a kind of implicit shader,
but it's not the kind of shader we usually mean when we talk about vertex and
fragment shaders today.

------
dahart
> Back in the day — way before we had hardware accelerated 3D graphics cards,
> let alone programmable GPUs — if you wanted to draw a 3D scene you had to do
> all that work yourself. In assembly. On a computer with a 7 MHz processor.

7 MHz? That's so fast and modern. Back in the day we were writing 3D fill
routines on a 6502 running at 1 MHz. With no floating point and no diagonal
line support. And in bare machine language, uphill both ways in the snow! ;)

~~~
vidarh
One of the first pieces of 6502 assembly I read, and spent ages deciphering,
was an implementation of Bresenham's line algorithm published in some
magazine. Who needs floating point...
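
For reference, here's that algorithm in modern C rather than 6502 assembly;
a standard all-octant formulation (`put_pixel` is a stand-in for whatever
writes a pixel). The whole trick is an integer error term, so no floating
point is ever needed:

    #include <stdlib.h>  /* abs */

    void put_pixel(int x, int y);  /* stand-in for the actual pixel write */

    /* Bresenham's line algorithm: integers only, one add/compare pair
       per axis per step. */
    void draw_line(int x0, int y0, int x1, int y1)
    {
        int dx = abs(x1 - x0), sx = x0 < x1 ? 1 : -1;
        int dy = -abs(y1 - y0), sy = y0 < y1 ? 1 : -1;
        int err = dx + dy;  /* accumulated error term */

        for (;;) {
            put_pixel(x0, y0);
            if (x0 == x1 && y0 == y1) break;
            int e2 = 2 * err;
            if (e2 >= dy) { err += dy; x0 += sx; }  /* step in x */
            if (e2 <= dx) { err += dx; y0 += sy; }  /* step in y */
        }
    }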

------
rl3
> _The framework then takes these shaders and your 3D data, performs some
> magic, ..._

If we juxtapose that statement with the following in an unrelated
introduction[0]:

 _'WebGL is often thought of as a 3D API. People think "I'll use WebGL and
magic I'll get cool 3d". In reality WebGL is just a rasterization engine.'_

I suppose when you're writing a software renderer from scratch without the
luxury of any API or hardware acceleration, such things are indeed magic.

[0] [http://webglfundamentals.org/webgl/lessons/webgl-fundamentals.html](http://webglfundamentals.org/webgl/lessons/webgl-fundamentals.html)

~~~
Retric
When people talk about 'magic' in software, they mostly just mean the
implementation details don't impact them. You can have two very different
GPUs both implement the same WebGL calls correctly.

------
vvanders
Kind of a shame they omitted matrices. They're one of the foundational bits
of any 3D API and one of the few things that translates well from fixed-
function/software raster to modern pipelines.

Still, it's great to know the fundamentals; texture formats, tiling and other
things are also really useful pieces to understand when working with 3D
pipelines.

~~~
Jasper_
Matrices are just a convenient notational trick for a set of linear algebra
expressions. They seem confusing until you realize that an identity matrix
represents:

    x' = 1*x + 0*y + 0*z
    y' = 0*x + 1*y + 0*z
    z' = 0*x + 0*y + 1*z

I prefer to teach 3D graphics without matrices because it's really not
anything more complicated than a compact notation. Tricks like "invert and
transpose to receive the normal matrix" or "take the first column to get your
up vector" make no sense unless you work out the algebra of what those things
mean.

And don't get me started on homogeneous coordinates, which are a way to put
translation in a matrix by shoving a convenient "1" constant into the input
vector, or the perspective matrix, which does near/far, the perspective
transform, and a depth remap all in the same matrix, and isn't easily
separable because it steals the "w=1" constant for the depth remap and also
adjusts "w" afterwards. It's the equivalent of reusing a local variable
because you're short on registers :)
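
To illustrate the "convenient 1" being smuggled in (a sketch with my own type
names, nothing more): multiply a 4x4 matrix by (x, y, z, 1) and the fourth
column falls out as an additive term, i.e. translation.

    typedef struct { float m[4][4]; } Mat4;  /* row-major */
    typedef struct { float x, y, z, w; } Vec4;

    Vec4 transform(const Mat4 *t, Vec4 v)
    {
        Vec4 r;
        r.x = t->m[0][0]*v.x + t->m[0][1]*v.y + t->m[0][2]*v.z + t->m[0][3]*v.w;
        r.y = t->m[1][0]*v.x + t->m[1][1]*v.y + t->m[1][2]*v.z + t->m[1][3]*v.w;
        r.z = t->m[2][0]*v.x + t->m[2][1]*v.y + t->m[2][2]*v.z + t->m[2][3]*v.w;
        r.w = t->m[3][0]*v.x + t->m[3][1]*v.y + t->m[3][2]*v.z + t->m[3][3]*v.w;
        return r;
    }

    /* With v.w == 1 and a matrix whose last column is (tx, ty, tz, 1),
       x' = x + tx and so on: translation expressed as a "linear" map. */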

~~~
dahart
True, matrices are convenient for notation, and they tend to be confusing at
first. A tutorial like this is better off without the detour through linear
algebra.

But without them you lose the ability to smoosh multiple transforms together,
making it harder to do anything hierarchical or animated. Personally, I'd
place their utility higher than "just" notational convenience.

I feel like we haven't figured out how to teach matrices: they're inherently
easy once you understand them, but we don't know how to introduce or explain
them well. Do you introduce them later, and do you have any good resources
for them once you do broach the subject with your students?

I like your identity example. I think I really started to "get" matrices after
realizing they are literally vectors put in a stack, and that the vectors
represent the state of transforming from the identity to those vectors. With
that in mind, a matrix can feel much easier than a rotation involving trig
functions and hand-coded dot products. You can do all kinds of rotating and
other transforming without any trig once you see matrices as transforms rather
than an opaque and mysterious brick of numbers. But I admit it took me years
to feel that way after my first encounters with matrices.
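
A concrete example of trig-free rotation (a sketch reusing the Vec3 helpers
from the earlier snippet; the names are my own): to rotate something to face
a direction, just build the matrix's rows directly as an orthonormal basis
with cross products.

    static Vec3 cross(Vec3 a, Vec3 b) {
        Vec3 r = { a.y*b.z - a.z*b.y,
                   a.z*b.x - a.x*b.z,
                   a.x*b.y - a.y*b.x };
        return r;
    }

    /* Build a rotation matrix (rows[0..2] = right, up, forward) that
       orients the forward axis along `dir`. No sin/cos anywhere: the
       rows ARE the transformed basis vectors. */
    void look_rotation(Vec3 dir, Vec3 world_up, Vec3 rows[3])
    {
        Vec3 fwd   = normalize(dir);
        Vec3 right = normalize(cross(world_up, fwd));
        Vec3 up    = cross(fwd, right);  /* unit length by construction */
        rows[0] = right;
        rows[1] = up;
        rows[2] = fwd;
    }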

~~~
Jasper_
There's nothing magical about matrix multiplication. Smooshing multiple
transforms together just comes down to taking those systems of equations and
combining them. If I have one transform that scales space by 2, and another
that translates space by +5 (OK, yeah, that's an affine transform, not a
linear one, but I want to focus on one coordinate for now), then I have:

    f(x) = 2*x

and

    g(x) = x+5

Using basic high school algebra we can find out that:

    f(g(x)) = 2*x+10

and

    g(f(x)) = 2*x+5

This applies all the way down. The magic isn't in numbers in a square shape,
it's in basic algebra. Matrix multiplication comes down to composition of
multiple systems of equations like this, and if you start from that, and
nothing more than the distributive property of multiplication, you can easily
derive "matrix multiplication".

Don't get me wrong. I use matrices in everything "production" for 3D graphics
-- the compact notation is extremely convenient. I just wish we'd stop
ascribing magical properties to matrices like "without them you lose the
ability to smoosh multiple transforms together" because that's clearly false.
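
That derivation can be checked numerically; a small self-contained sketch
(my own code) where each 1-D affine map is a 2x2 matrix acting on the column
vector (x, 1), so matrix multiplication literally is function composition:

    #include <stdio.h>

    typedef struct { double m[2][2]; } Mat2;

    static Mat2 mul(Mat2 a, Mat2 b) {
        Mat2 r;
        for (int i = 0; i < 2; i++)
            for (int j = 0; j < 2; j++)
                r.m[i][j] = a.m[i][0]*b.m[0][j] + a.m[i][1]*b.m[1][j];
        return r;
    }

    static double apply(Mat2 a, double x) {  /* acts on (x, 1) */
        return a.m[0][0]*x + a.m[0][1];
    }

    int main(void) {
        Mat2 f = {{{2, 0}, {0, 1}}};  /* f(x) = 2x    */
        Mat2 g = {{{1, 5}, {0, 1}}};  /* g(x) = x + 5 */
        printf("f(g(3)) = %g\n", apply(mul(f, g), 3));  /* 2*3+10 = 16 */
        printf("g(f(3)) = %g\n", apply(mul(g, f), 3));  /* 2*3+5  = 11 */
        return 0;
    }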

~~~
knlje
I would quickly start drifting if I attended a class on 3D graphics and the
teacher started writing everything open without matrices like that. You must
admit the usefulness of a short notation when communicating ideas.

While understanding and talking 3D graphics is certainly possible without
matrix notation, I really see no reason to purposefully omit it in teaching or
other communication. Teaching situation is also a good place to practice
standard notation.

------
air
Minor nitpick

"The green and blue colors, z-position, and normal vector are all interpolated
in the same manner. (Texture coordinates behave slightly differently because
there you’d also need to take the perspective into account.)"

Colors (c), z, and texture coordinates (t) should all be interpolated
differently because of perspective. You need to interpolate 1/z, c/z, and
t/z, and then for every pixel do a division, e.g. (c/z) / (1/z) = c.
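
A minimal sketch of that recipe (my own code, in floats for clarity; a real
inner loop would hoist the divides and step incrementally): interpolate c/z
and 1/z linearly across the span, then divide per pixel.

    /* Perspective-correct interpolation of an attribute c across a
       span of n >= 2 pixels, given its value and depth at both ends.
       c/z and 1/z are affine in screen space; c itself is not. */
    void perspective_span(float c0, float z0,  /* left end  */
                          float c1, float z1,  /* right end */
                          int n, float out[])
    {
        for (int i = 0; i < n; i++) {
            float t = (float)i / (float)(n - 1);
            float c_over_z   = (1 - t) * (c0 / z0)   + t * (c1 / z1);
            float one_over_z = (1 - t) * (1.0f / z0) + t * (1.0f / z1);
            out[i] = c_over_z / one_over_z;  /* (c/z) / (1/z) = c */
        }
    }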

------
fizixer
It may be a lost art for game developers. Far from it for CG grad students
and researchers. Quite the contrary: it's actually part of the rite of
passage, heck, an undergrad-level prerequisite, to know these things like the
back of your hand, plus a whole lot more, to do graduate-level CG work.

Even if you're not a researcher but wish to write, say, your own path tracing
code, you'll end up learning this.

So no, not a lost art at all, in my opinion.

------
ykl
This is a great article!

I strongly believe that an understanding of how old school 3D rendering worked
is an excellent thing for modern graphics programmers to have, to appreciate
and understand where all of our fancy modern graphics APIs and whatnot come
from. Back when I helped teach a GPU programming course, one of the
assignments I gave was a full-blown software rasterizer implemented entirely
in CUDA. Not so much "program in OpenGL" as "program an OpenGL". :)

~~~
eriknstr
In a video I saw recently [1], the speaker suggested reading old books about
earlier versions of DirectX from the late '90s and early 2000s, around
DirectX 9. Even if one never wants to actually use DirectX, he said, it's
worthwhile because most graphics engines are built on the concepts of those
versions of DirectX.

[1]:
[https://youtube.com/watch?v=06zp5GMe2rI&t=4m11s](https://youtube.com/watch?v=06zp5GMe2rI&t=4m11s)

------
linuxhansl
Ahh. The days. I remember, before I had learned about linear algebra, seeing
somebody rendering molecules as 3D wireframes. I had an Amiga back then with
its "Blitter" (it could draw lines in hardware, as long as you told it which
of the eight octants the line's angle fell into).

Then, being the geek I was, I sat down every day until I had figured out
perspective transformation and rotation (later I found out I had just
reinvented matrix multiplication). Of course I never thought of homogeneous
coordinates, so translation was an extra step to be done for each point.

I even worked out "real" red-green 3D. Oh, the days when I had time for this
stuff. Fond memories.

------
bhouston
I used to write my own triangle fill algorithms with their own shaders back in
the 1990s. Fun times:
[https://github.com/bhouston/3DMaskDemo1997](https://github.com/bhouston/3DMaskDemo1997)

Here is the optimized triangle fill code with embedded asm pixel shaders:
[https://github.com/bhouston/3DMaskDemo1997/blob/master/src/N...](https://github.com/bhouston/3DMaskDemo1997/blob/master/src/NCCTRI.CPP)

~~~
Radim
Ah, the 80s & early 90s, when you had to implement everything yourself and
every byte and instruction counted :)

The demo scene these days feels somehow less satisfying. The demos definitely
_look_ better, but they can plug into such a vast ecosystem of system
libraries that 64KB feels like cheating.

ASM mode 13h nostalgia.

~~~
EvilTerran
The minimalist scene is still out there, it's just a niche within a niche.

For instance, here's a demo from 2015, done in immediate mode and 256 bytes -
I've got a feeling you'll enjoy it as much as the live audience clearly do:

[https://m.youtube.com/watch?v=07KHwjebf7k](https://m.youtube.com/watch?v=07KHwjebf7k)

For another recent example, also in 2015 some wizards managed to coax 1024
colours (among other impressive visual effects) out of an original IBM PC - as
in, hardware from 1981:

[https://trixter.oldskool.org/2015/04/07/8088-mph-we-break-all-your-emulators/](https://trixter.oldskool.org/2015/04/07/8088-mph-we-break-all-your-emulators/)

~~~
wolfgke
> For instance, here's a demo from 2015, done in immediate mode and 256 bytes
> - I've got a feeling you'll enjoy it as much as the live audience clearly
> do:
>
> [https://m.youtube.com/watch?v=07KHwjebf7k](https://m.youtube.com/watch?v=07KHwjebf7k)

Even cooler (but from 2006), and also by digimind:

[http://www.pouet.net/prod.php?which=26655](http://www.pouet.net/prod.php?which=26655)

[https://www.youtube.com/watch?v=E6zhyutgX1k](https://www.youtube.com/watch?v=E6zhyutgX1k)

~~~
tripzilch
Wow to both. Yet to me, Immediate Railways (2015) blew my mind quite a bit
more than Demoplex (2006) ;)

Maybe I missed a detail, but: a cinema room, including chairs, a middle
aisle, a rotating camera, friggin' ambient occlusion AND an animated plasma
on the projector screen. Absolutely jaw-dropping, not a single doubt. I used
to write 4096b demos, but this is a class of its own, with even crazier
skills than I have; I could have a stab at it but I'd get stuck at 512b _at
least_ - cough - :-P

But Immediate Railways is the first 256b demo I remember seeing that actually
has multiple parts! Three parts, and they actually make sense! You could even
argue that the three parts connect into a (very basic, minimal, three-clause-
sentence) story line. Now, I haven't been keeping up with everything in the
scene, but that's a first for any 256b demo I have ever seen. In most cases a
256b demo showcases a single "thing", scene or effect that is a few times
more complex than Kolmogorov's worst acid dream. Also, I think the design and
colours are nicer than in Demoplex (and yes, that counts).

Fun thing to imagine, though: if you concatenated 16 of these babies to make
a single 4096b demo, you wouldn't stand a chance in a modern 4k compo :) Not
even allowing 20 of them sharing init code. The expectation of "way more than
the sum of its parts" has already exploded in the step from 256b to 4096b.
I'm not even sure one could fit a softsynth worthy of the music expected in a
modern 4k into 256b? (OK, probably one could, probably someone did, prove me
wrong already :p mine was about 1100b, back in 2000.)

------
leeoniya
Related: "JavaScript library for simple 3D graphics and visualisation on a
HTML5 canvas 2D renderer. It does not use WebGL. Works on all HTML5 browsers,
including desktop, iOS and Android."

[http://www.kevs3d.co.uk/dev/phoria/](http://www.kevs3d.co.uk/dev/phoria/)

[https://github.com/kevinroast/phoria.js](https://github.com/kevinroast/phoria.js)

------
jlarocco
A while back I created a small project for drawing 3D wireframe graphics using
the Common Lisp LTK interface to Tk.

It's slow (uses inefficient matrix algorithms, uses Tk, etc.) but it's "fast
enough" for some simple 3D scenes. Not very practical for real-life use, but
it was fun.

[https://github.com/jl2/ltk3d](https://github.com/jl2/ltk3d)

FWIW, it's not doing hidden-line removal; IIRC I was careful to pick a
viewing location that made it look good.

------
paulddraper
Great, great stuff. Terrific article.

---

It does seem to perpetuate -- or at least not make clear -- a misconception.

> 3D rendering without shaders

> We won’t use any 3D APIs at all

Those are two independent statements.

Metal, OpenGL, WebGL, and Vulkan are not 3D APIs. They are (2D) rasterization
APIs using shaders. Any 3D-ness of the math is external to them. In contrast,
OGRE, Java 3D, and three.js _are_ 3D rendering APIs.

Two independent choices yield four ways to do 3D rendering. E.g., in the
browser they could be:

                     |         3D API         |       no 3D API        |
      ---------------|------------------------|------------------------|
        GPU shaders  | three.js, using WebGL  |         WebGL          |
      ---------------|------------------------|------------------------|
      no GPU shaders | three.js, using canvas |         canvas         |

This article fits in the bottom-right corner.

I take notice when I hear the oft-repeated claim that OpenGL/WebGL are 3D
rendering APIs. At www.lucidchart.com, in 2015 we chose to use WebGL, when
available, to improve rendering performance for (2D) diagramming. Were WebGL
made for 3D stuff, that would have been a weird choice, but WebGL is for
high-performance rasterization of all kinds.

[http://webglfundamentals.org/webgl/lessons/webgl-2d-vs-3d-li...](http://webglfundamentals.org/webgl/lessons/webgl-2d-vs-3d-library.html)

------
buzzier
How OpenGL works: software renderer in 500 lines of code
[https://github.com/ssloy/tinyrenderer/wiki](https://github.com/ssloy/tinyrenderer/wiki)
[https://news.ycombinator.com/item?id=11264469](https://news.ycombinator.com/item?id=11264469)

------
c0ffe
Great article! Reading the title, I thought it was about the "tricks" that
games used when the best thing available was the fixed pipeline.

I still remember how amazed I was when I learned about the good balance
between performance cost and the resulting image when using textures for
static lighting (lightmaps).

------
a_c
This is the kind of article that I enjoy reading a lot. Most tools available
today mask away fundamental concepts, and many aspiring young engineers learn
to use "tools". While the ability to use various tools is of paramount
importance, the most valuable skill an engineer can possibly possess, in my
opinion, is the ability to create new tools/concepts/whatever from 1st-ish
principles.

------
hellofunk
I have a question about the rasterization step. When creating the scanlines,
would this be a possible entry point for anti-aliasing, by giving the lines a
subtle gradient that goes to near-zero alpha at the left and right edges (and
maybe also at the top and bottom edges for the lines at the top and bottom of
the stack)? There are many ways to do anti-aliasing, and this seems like one
possibility to me.

------
Waterluvian
Any suggestions on a good primer for what shaders are and how they work? For
years I've always thought "shaders" are just effects you can layer onto a
rendered scene. Say, to get an 80s effect, or bloom, or a cel shading effect,
etc. I never really thought of it as a way to actually do the base scene
rendering.

------
kgabis
scratchapixel.com has excellent tutorials on computer graphics and how to do
rasterization [1].

[1] [https://www.scratchapixel.com/lessons/3d-basic-rendering/rasterization-practical-implementation](https://www.scratchapixel.com/lessons/3d-basic-rendering/rasterization-practical-implementation)

------
foota
I've thought before that it would be cool to implement a rasterizer in
something like OpenCL.

------
BatFastard
Thank god it is almost lost; interesting, but highly specialized, and ORDERS
of magnitude slower.

~~~
munchbunny
Other than some very specific things, like the rasterization algorithm for
filling in a triangle, you still can't really skip learning the concepts in
the article. The concepts are still quite relevant in the post-fixed-
function-pipeline world, except you do even more math and even more clever
tricks.

The GPU and associated APIs just nicely abstract specific computations for
you, like matrix transforms, texture sampling, depth testing, etc., but the
moment you want to do anything fancy you're right back in the depths of it.

------
quickben
All these 'for' loops and sequential trigonometric calls...

See, I wouldn't say that art was lost. It was obsoleted to dust by a more
modern and scalable approach.

~~~
pandaman
Nobody really wrote software rendering like that beyond CG classes. I'd think
author is simplifying for the sake of accessibility but it's actually more
complicated than what production renderers did. One could also think it's to
show a GPU's internal work, but, again, GPUs don't do this either.

~~~
jblow
Wrong, wrong, wrong, wrong. Wrong.

People in the video game industry wrote tons of this stuff. We would spend
weeks figuring out how to get one or two instructions out of the rasterizer or
scanline converter, etc. I know this because I was there. I wrote several
software rasterizers, and I learned how to do it by reading papers and
magazine articles written by other people who wrote software rasterizers.

I have no doubt that other industries did so as well.

Even more recently, companies like RAD Game Tools built very fast software
rasterizers as products (e.g. Pixomatic).

Also, what's in this article is a simplified introductory take. It is actually
much much more complicated than this. (It doesn't look to me like he is doing
perspective-correct shading, for example.) Also this guy's code is crazy slow
compared to what you'd write in the real world, but hey, it is a tutorial.

~~~
pandaman
Are you sure? You have shipped games with software rasterizers, and they used
this exact algorithm (no matrices in the transform, computing gradients for
each scanline)?

~~~
psyc
I also wrote software renderers in the 90s. I would say it was mixed. Some
people used matrices, and some didn't. You really can't say that everybody
used the matrix formalism across the board. For that matter, I worked on an
engine at a well-known game company as recently as 2013, where there were
definitely hand-optimized paths for several different cases of transforms
where the constraints were known. Generally speaking, game programmers will do
whatever it takes, and painstakingly optimizing algos down to minimum
arithmetic operations has always been pretty common in the field.

~~~
pandaman
I too wrote software rasterizers in the 90s, and even shipped a game with
one. If you used a single sine call for each vertex, let alone computed a
full rotation, you were fucked. There was no hardware that could handle this
in real time in the 90s.

If you did a division for each scanline, you were running at 1/4th of the
speed at most - division was that expensive on the 386 through the Pentium,
not to mention other platforms whose CPUs might not even have had a hardware
divide.

> Generally speaking, game programmers will do whatever it takes, and
> painstakingly optimizing algos down to minimum processor operations has
> always been pretty common in the field.

I am not even talking about crazy optimizations (like using Intel x86 half-
registers to do ghetto fixed point), I am talking about common-sense stuff.
~~~
psyc
Ah. I think you're talking about something completely different from what
jblow and I were talking about. If you mean that the code in the article is
not optimized, and is doing crippling amounts of redundant computation, then
of course I agree with you.

~~~
pandaman
The guy asked what kind of magic calling trig functions in inner loops is. I
replied that nobody really did it, and I am confused as to what the author
was trying to show.

I, frankly, do not understand what you and jblow read into this other than
what I said.

~~~
jblow
Hahaha, I guess it's a big misunderstanding. You wrote:

"Nobody really wrote software rendering like that beyond CG classes".

I read this as a claim that nobody in general wrote software renderers, when
by "like that" you just meant using the specific techniques he used.

That said, I still have to disagree, in the sense that, to get to a fast
software renderer, you start with a slow software renderer. Nobody does all
the crazy optimizations a priori ... so stuff like a divide per pixel was
common, say. Calling trig functions in inner loops is of course goofy, but my
presumption is that in the next step of refinement those would be lifted out
of the loops, because that is the way things are always done.

~~~
tripzilch
> Nobody does all the crazy optimizations a priori ... so stuff like a divide
> per pixel was common, say. Calling trig functions in inner loops is of
> course goofy, but my presumption is that in the next step of refinement
> those would be lifted out of the loops, because that is the way things are
> always done.

Yeah... but no. It depends on what era you're talking about, I guess.

When I wrote my first low-level rasterizer + basic 3D poly-engine-ish thing
in 1998, I honestly wouldn't have considered for a second doing a division
per scanline. An integer add was one tick, a mul was 3-10 (IIRC), but a
division was 10-40 ticks. Depending on at what point you start considering
the optimizations "crazy": yes, had I known (and cared) about perspective-
correct shading back then, I would have started designing the algorithm a
priori (which for me usually meant on grid paper), hunting for some way of
faking a sufficiently accurate reciprocal using adds, shifts and at most 2
muls (per scanline, 'cause per pixel even a single mul was madness,
obviously). With what I knew back then, I'd probably have gone for a
2nd-degree polynomial that might have been sufficient to at least give the
impression it was doing better than naive bilinear :) Had I been aware of
Carmack's (objectively crazy) inverse-sqrt hack, I would probably have
started looking in that direction (abusing the IEEE float spec at the bit
level, woohooooo).

Sure, you could write a perspective-correct triangle rasterizer with a div
per scanline, and it would be too slow; a nice theoretical proof of concept,
but also kinda useless if it turned out you couldn't make the above crazy
hacks run fast enough. They were a real hurdle that had to be crossed, or you
might as well not bother. Also, why save the fun stuff for last? ;-)
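
For flavor, here's one shape such a hack could take (my own sketch, in floats
where the 1998 version would have been fixed point, and it still pays three
divides per scanline rather than the division-free setup described above):
fit a quadratic to 1/z through three exact samples, then evaluate it by
forward differencing, two adds per pixel.

    /* Approximate w = 1/z across an n-pixel span with a quadratic
       through exact samples at the start, middle and end, evaluated
       incrementally: after setup, each pixel costs two adds. */
    void approx_recip_span(float z_start, float z_mid, float z_end,
                           int n, float out[])
    {
        float w0 = 1.0f / z_start, wm = 1.0f / z_mid, w1 = 1.0f / z_end;
        float h  = n / 2.0f;
        /* q(i) = a*i*i + b*i + w0 with q(0)=w0, q(h)=wm, q(2h)=w1 */
        float a  = (w1 - 2.0f * wm + w0) / (2.0f * h * h);
        float b  = (4.0f * wm - 3.0f * w0 - w1) / (2.0f * h);
        float q  = w0;         /* q(0)                        */
        float d1 = a + b;      /* first difference q(1)-q(0)  */
        float d2 = 2.0f * a;   /* constant second difference  */
        for (int i = 0; i < n; i++) {
            out[i] = q;        /* approximate 1/z at pixel i  */
            q += d1; d1 += d2; /* forward differencing        */
        }
    }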

