There's a Wikipedia entry on combining standard deviations (Standard_deviation#Combining_standard_deviations), but it's dense if you don't have a math background (I don't). The crux of it: to compute the stddev of a set, you compute the average, then the sum of squared deltas from that average [i.e. sum((elem[i] - avg)^2 for each i)]. Once the individual elements are gone you can't compute that sum of squares directly, but from the stored stddevs, means and element counts you can reconstruct that information when computing the new stddev.
Well, it's a lot easier to explain with a whiteboard. You're basically reconstructing the information you no longer have from the stddev/mean-aka-avg/nelems data you do have, and it all works out exactly.
numsum1 = SUM(nelems[i] * (stddev[i]^2 + mean[i]^2))
numsum2 = SUM(nelems[i] * mean[i])^2 / SUM(nelems[i])
combined_stddev = SQRT((numsum1 - numsum2) / SUM(nelems[i]))
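The three lines above can be sketched directly in Python. This is my own minimal version, not from any library; it takes and returns population stddevs:

```python
import math

def combine_stddevs(nelems, means, stddevs):
    """Combine per-set (count, mean, stddev) triples into one overall stddev.

    Population stddevs in, population stddev out.
    """
    total_n = sum(nelems)
    # SUM(nelems[i] * (stddev[i]^2 + mean[i]^2)) -- recovers each set's
    # sum of squared elements from its stored mean and stddev
    numsum1 = sum(n * (sd ** 2 + m ** 2)
                  for n, m, sd in zip(nelems, means, stddevs))
    # SUM(nelems[i] * mean[i])^2 / SUM(nelems[i]) -- the grand total
    # squared, scaled by the combined count
    numsum2 = sum(n * m for n, m in zip(nelems, means)) ** 2 / total_n
    return math.sqrt((numsum1 - numsum2) / total_n)
```

Combining the stats of two halves of a dataset this way gives the same answer as computing the stddev over the whole dataset at once.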
* Once you create an RRA (archive file) you can't modify it to add or remove metrics, or change their properties. This makes them relatively inflexible.
* Updating RRAs is I/O heavy. Every time an update comes in, the OS must read, modify and write a page.
* rrdcached mitigates this somewhat by deferring flushes, but there are diminishing returns (eventually the incoming write rate causes cache flushes and filesystem metadata updates to exceed the available IOPS), and you risk data loss if the power goes out or the OOM killer kills the process.
Time-series data access patterns tend to be write-heavy. Storing first in an append-only log is a big win here; Cassandra and MySQL are both good choices, though you do have to think about the schemata first. And disk is so cheap now that expiration can be an afterthought.
A simple tar + gzip is all you need to flush to disk, at whatever frequency you choose. It turns out rrd write operations are safe enough that you can do this without corruption. And the I/O cost is minimal compared to rrdcached: rrd data compresses extremely well.
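That flush step is a one-liner in shell; here it is as a small Python sketch instead. The directory layout and the `.rrd` glob are assumptions, so adapt them to wherever your archives live:

```python
import tarfile
import time
from pathlib import Path

def snapshot_rrds(rrd_dir, out_dir):
    """Bundle every .rrd file under rrd_dir into one timestamped
    gzipped tarball in out_dir; returns the archive path."""
    archive = Path(out_dir) / f"rrd-{int(time.time())}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        for rrd in sorted(Path(rrd_dir).glob("*.rrd")):
            tar.add(rrd, arcname=rrd.name)
    return str(archive)
```

Run it from cron (or a loop) at whatever interval matches how much recent data you can afford to lose.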
But more importantly, you don't care. This will give you 3 orders of magnitude better write throughput with 2 hours of work. The savings in engineering time alone will buy you 1000x the RAM you wasted!
With that kind of achievement, my advice would be leave over-optimization for later and buy yourself a drink :)