L1 cache: a squirrel (1 kg)
L2 cache: a mid-sized cat (~5 kg)
RAM: a tall, well-muscled man (~80 kg)
Hard disk: one hundred blue whales (100 × ~130 metric tonnes ≈ 13,000 tonnes)
This is what I mean when I say "It doesn't matter how fast your language is, you're just racing to get to wait on I/O faster."
P.S. Let's extend the analogy to include two other common factors:
Typical round trip to database: the combined mass of every ship, plane, and person in the USS Nimitz's carrier strike group (150 ms ≈ 150,000 metric tonnes)
Time for user's computer to render a web page of medium complexity: a few hours' worth of worldwide cement production in 2009 (2 seconds ≈ 2 million metric tonnes)
But please, spend time optimizing your string concatenation... because that is going to help ;)
[Edited: Revised and extended because I introduced a conversion error or two and then compounded them. Word to the wise: mental conversion to fractions of blue whales not advisable before morning coffee.]
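If you want to check the scale yourself, here's a quick sketch in Java, assuming the implied conversion of roughly 1 kg per nanosecond and the ballpark latencies above (none of these are measurements from any particular machine):

    // Back-of-envelope check of the analogy, assuming ~1 kg per nanosecond.
    public class LatencyMass {
        public static void main(String[] args) {
            print("L1 cache", 1e-9);        // ~1 ns   -> a squirrel
            print("L2 cache", 5e-9);        // ~5 ns   -> a cat
            print("RAM", 83e-9);            // ~83 ns  -> a large man
            print("Hard disk seek", 14e-3); // ~14 ms  -> ~14,000 tonnes of whale
            print("DB round trip", 150e-3); // ~150 ms -> ~150,000 tonnes
            print("Page render", 2.0);      // ~2 s    -> ~2,000,000 tonnes
        }
        static void print(String what, double seconds) {
            double kg = seconds / 1e-9;     // 1 ns of latency = 1 kg of mass
            System.out.printf("%-14s %,18.0f kg (%,.0f tonnes)%n", what, kg, kg / 1000);
        }
    }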
L1 - There is a sandwich in front of you.
L2 - Walk to the kitchen and make a sandwich.
RAM - Drive to the store, purchase sandwich fixings, drive home, and make a sandwich.
HD - Drive to the store. Purchase seeds. Grow seeds... ... ... Harvest lettuce, wheat, etc. Make sandwich.
(I'm using 83ns and 14ms, as in the article, and half an hour for RAM; at that ratio the HD strictly works out to nearly a decade rather than one year.)
Disk is largely something you keep around so you can handle really large things (like pictures and video) and so that all your data doesn't go 'poof' when you lose power.
(the other side of that, of course, is that if you want your data to be in good shape after the aforementioned power loss, then yeah, your writes will block on disk speed. But for most people, the 'sane but not correct' default of ext3 and the like is good enough.)
But it does matter how small your runtime (and inner loops) are, if you want to live in the L1 and L2 cache and avoid paying the RAM latency penalty.
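That's easy to trip over even from a high-level language. Both loops in this rough Java sketch read the same 64 MB, but the column-order walk defeats the cache and typically runs several times slower (usual microbenchmark caveats apply; there's no JIT warmup here):

    // Same bytes, different order: the strided walk misses cache constantly.
    public class LoopOrder {
        static final int N = 4096;
        public static void main(String[] args) {
            int[][] m = new int[N][N];
            long sum = 0;
            long t0 = System.nanoTime();
            for (int i = 0; i < N; i++)     // row-major: sequential, cache-friendly
                for (int j = 0; j < N; j++)
                    sum += m[i][j];
            long t1 = System.nanoTime();
            for (int j = 0; j < N; j++)     // column-major: strided, cache-hostile
                for (int i = 0; i < N; i++)
                    sum += m[i][j];
            long t2 = System.nanoTime();
            System.out.println("row-major:    " + (t1 - t0) / 1_000_000 + " ms");
            System.out.println("column-major: " + (t2 - t1) / 1_000_000 + " ms");
            System.out.println("(sum = " + sum + ")"); // keeps the loops from being elided
        }
    }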
Maybe you meant to start with 1 kilogram? : D
L1$: a second
Hard disk: more than five months
Great to hear it in different formats though. I can stare at numbers all day and still not really get it, but the difference between a second and 5 months is as subtle as a punch in the nose. Good stuff.
Take a look at the above scale: page rendering times are going to dominate everything else. There are simple, repeatable, effective ways to reduce them. See the presentations from the YSlow guys.
Unfortunately, most webapp developers don't use reasonable cache-control headers. With most PHP apps, if your proxy does a HEAD request to see whether it should re-pull content, the app renders the full page and throws everything away except the headers. (This may be false now, but when I did this they were using PHP3, and in PHP3 you had to write extra code to handle HEAD requests properly, which most programmers did not.)
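For what it's worth, the servlet world makes the cheap path fairly easy. A sketch (the five-minute max-age and the canned timestamp are placeholders, not recommendations):

    import java.io.IOException;
    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;

    // Answer HEAD and conditional GET from metadata instead of rendering.
    public class ArticleServlet extends HttpServlet {
        // Stand-in for a real "when did this content last change?" lookup.
        private long contentLastModified() { return 1_234_567_890_000L; }

        // The container uses this to answer If-Modified-Since with a 304,
        // skipping doGet() entirely when the client's copy is current.
        @Override
        protected long getLastModified(HttpServletRequest req) {
            return contentLastModified();
        }

        // The default doHead() runs doGet() and throws the body away --
        // exactly the waste described above -- so answer from metadata.
        @Override
        protected void doHead(HttpServletRequest req, HttpServletResponse resp) {
            resp.setContentType("text/html");
            resp.setHeader("Cache-Control", "public, max-age=300"); // let Squid reuse it
            resp.setDateHeader("Last-Modified", contentLastModified());
        }

        @Override
        protected void doGet(HttpServletRequest req, HttpServletResponse resp)
                throws IOException {
            resp.setContentType("text/html");
            resp.setHeader("Cache-Control", "public, max-age=300");
            resp.getWriter().write("<html><body>expensively rendered page</body></html>");
        }
    }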
Still, my experience has been that using something like squid gives you a pretty massive performance advantage, even when your webapp is uncooperative.
I have been a "hero" many, many times by doing exactly this, which says something unfortunate about the level of "average" programming knowledge in corporate IT.
Using StringBuffer, although it looks a bit crappy in the code (IMHO), actually does result in massive performance gains.
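A crude benchmark shows why: += on String copies the whole accumulated string on every pass (quadratic overall), while StringBuffer appends into a growable buffer. (StringBuilder is the unsynchronized drop-in when you don't need thread safety.)

    // Timings vary by JVM, but the gap is usually orders of magnitude.
    public class ConcatDemo {
        public static void main(String[] args) {
            int n = 50_000;
            long t0 = System.nanoTime();
            String s = "";
            for (int i = 0; i < n; i++) s += "x";       // re-copies s every pass
            long t1 = System.nanoTime();
            StringBuffer sb = new StringBuffer();
            for (int i = 0; i < n; i++) sb.append("x"); // amortized O(1) per append
            String s2 = sb.toString();
            long t2 = System.nanoTime();
            System.out.println("concat:       " + (t1 - t0) / 1_000_000 + " ms");
            System.out.println("StringBuffer: " + (t2 - t1) / 1_000_000 + " ms");
            System.out.println(s.length() == s2.length()); // keep both results live
        }
    }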
For my latest work project (I do scientific computing), I realized it's easier to do it on my laptop instead of the workstation. Since my laptop has an SSD, I can just use the filesystem as my database. This means that I can have millions of files (literally, millions) lying around and process them using the good old Unix shell. It greatly reduces the development time compared to using a database. Just for giggles I tried doing this on a machine with a hard drive, and it was more than one hundred times slower.
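The access pattern is just a recursive walk over tiny files, something like this sketch (the results directory and .dat suffix are made up): on an SSD those scattered reads are nearly free, while on a spinning disk each one can cost a seek, which is where the 100x comes from.

    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.Paths;
    import java.util.stream.Stream;

    // One small file per record; count the records with a recursive walk.
    public class WalkResults {
        public static void main(String[] args) throws IOException {
            Path root = Paths.get("results"); // hypothetical layout: results/**/*.dat
            try (Stream<Path> paths = Files.walk(root)) {
                long records = paths.filter(Files::isRegularFile)
                                    .filter(p -> p.toString().endsWith(".dat"))
                                    .count();
                System.out.println(records + " record files");
            }
        }
    }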
Right now I am back to a spinning disk and Time Machine for hourly backups, and I'm seriously considering selling the replacement Intel SSD when I get it back. Some things, like loading programs and starting up and shutting down, were much faster on the SSD. But when it comes to installing software, extracting files, or writing data to disk, this 7200 rpm 500 GB drive is faster than the SSD.
Also worth noting: the $300+ for the 80 GB Intel SSD could buy four 500 GB laptop drives.
SSDs are great; the problem is that a good SSD costs something like $15 per gigabyte, while good registered ECC DDR2 RAM costs just over $20 per gigabyte. Sure, in applications where consistency across power-loss events is a huge deal, an SSD is the right answer, but for most applications, buying a whole lot of RAM is often faster and not that much more expensive.
Could I have re-written the code by messing around with inodes and other low-level details so that it accessed the files in physical order? Probably. Was it worth my time, rather than using an SSD? Hell no.
I agree that SSDs are still a tad expensive for the average Joe. For most hackers, considering that we spend most of our work hours in front of a computer, I feel that the added productivity from an SSD is easily worth the investment.
The idea is that if you have enough RAM, you only need to read the files from disk once. After that, the files are in the RAM cache, and at least for reads it doesn't matter how spread out on disk they were.
And yeah, you do need to read the files from disk once, and that is slow; thus you often see a 'warm-up' effect on servers, where hitting a new page is slower for the first visitor than for the second person who hits that same page.
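You can watch the warm-up with a couple of lines of Java; point it at any large file that isn't already cached (the path below is only a placeholder):

    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Paths;

    // First pass hits the disk; second pass is served from the OS page cache.
    // (If the file was cached beforehand, both passes will look fast.)
    public class WarmUp {
        public static void main(String[] args) throws IOException {
            String file = args.length > 0 ? args[0] : "/var/log/syslog";
            for (int pass = 1; pass <= 2; pass++) {
                long t0 = System.nanoTime();
                byte[] data = Files.readAllBytes(Paths.get(file));
                long ms = (System.nanoTime() - t0) / 1_000_000;
                System.out.println("pass " + pass + ": " + data.length + " bytes in " + ms + " ms");
            }
        }
    }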
Yes, running on an SSD takes the entire filesystem much closer to RAM speeds. However, you are doing so at almost RAM prices. (I'm speaking of good SSDs like the X25-E, which comes to something like $15 per gigabyte; the not-so-good SSDs have problems of their own. I have an SSD in my laptop right now, branded by one of the gamer RAM companies, I forget which one. It was pretty cheap, under $2 per gigabyte. It's pretty nice for reads; for writes, sometimes it is good, but often it's worse than spinning disk.) The advantage of just buying the RAM is that a good virtual memory manager can automatically keep the data you access most often in RAM.
Like I said, I use an SSD in my laptop: it's a cheap brand and it's small, but my laptop doesn't need a lot of storage, so the cost is reasonable, and I use a journaling filesystem, so writes are cached and the slow sub-cell-size write speeds of the cheap SSD aren't a huge problem. I'm just explaining why in my servers I prefer to go with a whole lot of RAM plus slow, cheap, large SATA, rather than less RAM and an expensive SSD.
Many of us are also stuck with mission-critical legacy 32-bit servers that we can't just take down and upgrade so easily, and licenses for the enterprise versions of some DBs that can handle assloads of RAM don't come cheap. SSDs are a much cheaper, no-brainer upgrade for that aging DB server that just needs to be faster. The X25-E smokes here, letting you get that speed without the expensive license.
There you can find more nice figures.
It would be interesting to see some account of the cache size / die size cost / performance trade-offs.
Nehalem is about 70% cache. Most of it is the shared L3 between cores. There are physical limits to how large a cache can be and still run synchronously. The L1 is still tiny (64k, split between instructions and data), and it's really not feasible to make it larger without affecting clock speed. But if you drop down just a little and pay a latency cost, you can stick a 256k unified cache on each core. Then they all talk to the 3M shared "uncore" cache.

But to a first approximation, a modern "CPU" is entirely SRAM.
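You can even see that hierarchy from userland by chasing a random permutation through arrays of growing size; the average time per hop jumps each time the working set falls out of a cache level. A rough sketch (a JVM is a blunt instrument for this, so expect noise, but the staircase usually shows):

    import java.util.Random;

    // Pointer-chase through working sets from 4 KB to 64 MB and report ns/hop.
    public class CacheStaircase {
        public static void main(String[] args) {
            for (int bytes = 1 << 12; bytes <= (1 << 26); bytes <<= 2) {
                int n = bytes / 4;                 // ints are 4 bytes
                int[] next = new int[n];
                for (int i = 0; i < n; i++) next[i] = i;
                Random rnd = new Random(42);
                for (int i = n - 1; i > 0; i--) {  // Sattolo's algorithm: one big cycle
                    int j = rnd.nextInt(i);
                    int tmp = next[i]; next[i] = next[j]; next[j] = tmp;
                }
                int hops = 10_000_000, p = 0;
                long t0 = System.nanoTime();
                for (int i = 0; i < hops; i++) p = next[p];
                double nsPerHop = (System.nanoTime() - t0) / (double) hops;
                System.out.printf("%6d KB: %5.1f ns/hop (p=%d)%n", bytes / 1024, nsPerHop, p);
            }
        }
    }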
Not like I have a hope in hell of getting anything I write even into 6MB .. sigh.
The last I heard, both L1 and L2 cache were SRAM-based, which uses 6 transistors per bit, while DRAM takes one transistor per bit plus a capacitor. If L1 and L2 cache use the same number of transistors per bit, what is the difference between them?
Sure, CPU contention can slow things down, but it's usually not the 'fall off a cliff' performance degradation that hitting disk (rather than hitting ram cache) is.
But seriously, if you're going to use a GIF, at least make it animated.
It's that kind of speedup I'd like to see -- disk-heavy tasks that don't peg the CPU.
This will get figured out and make a big difference. But still no comparison to RAM.