Rasterization really doesn't require that much work, but you need to be able to do a world/view-transform, clip triangles/quads, and do perspective correct texturing. The most sophisticated bit of that is a clipping algorithm, which is really easy to implement.
Ray tracing requires you to generate a ray per pixel (essentially the opposite world/view-transform), determine ray/box intersections, and based on the intersection coordinate of ray and box determine a texture coordinate.
As someone who has done both, I would say the two procedures are pretty the same level of complexity (if you stay away from bilinear texture interpolation), but I admit that raytracing feels easier to implement as you avoid the clipping and perspective correct texturing.
However, The Minecraft world is a uniform grid of "boxes", so it contains a lot of quads leading to potentially huge amounts of overdraw which quickly becomes infeasible for a software-rasterizer. So if you wish to rasterize in software, you'll need to do a bit of additional work to avoid drawing a lot of hidden box-sides (ignore shared sides), and you'll never get overdraw down to 0 unless you use additional screen-based data-structures.
On the other hand, the author had to implement a raycasting algorithm on the uniform grid for the raytracer to be efficient. This is actually also a little bit painful.
So for that reason, the ray-tracer is definitely the right decision here.
On a related note, I tried to implement raycasting on a uniform 3D grid on a 486/66 Mhz in 1996... Got around 2-5 FPS in 320x200. So it was completely infeasible back then.