If you want something more approachable as a gateway to this, I highly recommend Jon Stokes' Inside the Machine. It covers microprocessor architecture in an accessible way and has a good chapter on the memory/cache hierarchy that is a sort of light version of a good chunk of Drepper's paper, or at least will be a big help in making sense of it. If you've never been close to the metal, or you want to catch up on newer developments, consider checking it out.
Also, here's Drepper's paper compiled into a single PDF, which I think was edited a little later than the LWN publication and might have had some minor errata fixes: http://www.akkadia.org/drepper/cpumemory.pdf
As much as I liked Stokes's articles in Ars, I think programmers should probably skip Inside the Machine and go straight to Hennessy and Patterson. Inside the Machine is loaded with analogies that I find annoyingly distracting; I'd rather learn how a computer actually works than learn an analogy about how it works.