With voxels, positions are stored implicitly (a voxel's position is just its index in the grid), and uncompressed data is guaranteed to have some cache locality. That suits GPUs, which are really good at chugging through large bitmaps, and that is what I am doing (I only render to bitmaps). GPUs now offer teraflops of compute, meaning trillions of floating-point operations every second, and in my case I can render about 1-2 billion voxels per second.
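As a rough illustration of positions coming for free with an uncompressed grid, here is a minimal Python sketch. It is not code from my engine; the names `flat_index` and `coords` and the x-fastest layout are assumptions made for the example.

```python
def flat_index(x, y, z, size_x, size_y):
    """Map a voxel coordinate to its offset in a flat, x-fastest array."""
    return x + size_x * (y + size_y * z)


def coords(index, size_x, size_y):
    """Recover the voxel coordinate from the flat offset alone."""
    x = index % size_x
    y = (index // size_x) % size_y
    z = index // (size_x * size_y)
    return x, y, z


# A 4x4x4 grid with one byte per voxel; no position is stored anywhere.
SIZE = 4
grid = bytearray(SIZE * SIZE * SIZE)
grid[flat_index(3, 2, 1, SIZE, SIZE)] = 255  # mark one voxel as solid
```

Neighbors along x sit next to each other in memory, which is where the cache locality comes from in this layout.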
Additionally, doing procedural generation with polygonal surfaces is much more complex (I've done it, I know). Implementing things like Voronoi patterns (used for the rocks) is trivial with voxels. That is not the case with polygons, unless you are just wrapping voxelized data. It's the difference between evaluating a graph at every point and calculating the graph at only the needed points. Voxels are implicit, polygons are explicit.
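To show why a Voronoi pattern is easy when you just evaluate every voxel, here is a minimal Python sketch. It is an illustration under my own assumptions, not the actual rock generator: the function names, the `gap` parameter, and the "leave a crack where the two nearest seeds are almost equally close" rule are all made up for the example.

```python
import math
import random


def nearest_two(point, seeds):
    """Return (index of nearest seed, distance to it, distance to the second nearest)."""
    best_i, d1, d2 = -1, math.inf, math.inf
    for i, seed in enumerate(seeds):
        d = math.dist(point, seed)
        if d < d1:
            best_i, d1, d2 = i, d, d1
        elif d < d2:
            d2 = d
    return best_i, d1, d2


def voronoi_voxels(size, seed_count, gap=1.0, rng_seed=0):
    """Fill a size^3 grid; each solid voxel maps to the id of its Voronoi cell.

    Voxels near a cell boundary (d2 - d1 <= gap) are left empty,
    which carves cracks between neighboring cells.
    """
    rng = random.Random(rng_seed)
    seeds = [tuple(rng.uniform(0, size) for _ in range(3))
             for _ in range(seed_count)]
    grid = {}
    for x in range(size):
        for y in range(size):
            for z in range(size):
                center = (x + 0.5, y + 0.5, z + 0.5)
                cell, d1, d2 = nearest_two(center, seeds)
                if d2 - d1 > gap:
                    grid[(x, y, z)] = cell
    return grid


rocks = voronoi_voxels(size=8, seed_count=4)
```

The whole pattern is one distance test per voxel. Doing the same with polygons would mean computing the cell boundaries explicitly and building meshes for them.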
Really, you can think of VQ as a really fancy graphing calculator. In fact, all of the structures are just an extension of superellipsoids (http://en.wikipedia.org/wiki/Superellipsoid).
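To make the graphing-calculator idea concrete, here is a small Python sketch of the plain superellipsoid inside test, (|x/a|^r + |y/b|^r)^(t/r) + |z/c|^t <= 1, sampled on a voxel grid. This is only the textbook formula, not the extended version VQ uses, and the function names and grid bounds are assumptions for the example.

```python
def superellipsoid(x, y, z, a=1.0, b=1.0, c=1.0, r=2.0, t=2.0):
    """Implicit function: values <= 1 are inside the shape, > 1 are outside."""
    xy = abs(x / a) ** r + abs(y / b) ** r
    return xy ** (t / r) + abs(z / c) ** t


def voxelize(n, **shape):
    """Sample the function at the centers of an n^3 grid spanning [-1, 1]^3."""
    filled = set()
    for i in range(n):
        for j in range(n):
            for k in range(n):
                # Map voxel indices to the voxel's center in [-1, 1].
                x, y, z = (2 * (v + 0.5) / n - 1 for v in (i, j, k))
                if superellipsoid(x, y, z, **shape) <= 1.0:
                    filled.add((i, j, k))
    return filled


sphere = voxelize(8)                   # r = t = 2 gives an ordinary sphere
rounded_box = voxelize(8, r=8.0, t=8.0)  # larger exponents approach a box
```

Changing two exponents turns a sphere into a rounded box with no change to the sampling loop, which is the appeal of the implicit approach.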
My best recommendation - dive in and start coding. You are your own best teacher. :)