
DDR4 memory: Twice the speed, less power
http://www.theregister.co.uk/2012/09/27/jedec_ddr4_spec/
======
ajross
It's important to note that "twice the speed" refers to transfer speed, not
latency or cycle time. Modern DRAM technology has capped out at a cycle rate
of about 30MHz, and that isn't going to change much in the future. That time is
limited by the time required to precharge the bit lines to a voltage accurate
enough to measure the stored values against. These are big wires going across
the whole chip by definition, so they don't see improvements due to process
shrinkage. On a modern DDR3-1600 part, there are 1.6G potential transfers per
second, but a full random access cycle requires about 60 of these (10-10-10-30
timings are typical). The time taken to issue the command is 2 clocks, and the
time taken to read the data is 4. The rest is just idle waiting. So even
"infinitely clocked" dram would be only about 10% faster in the worst case.

~~~
codex
Is this 30MHz value per chip, or per memory channel? Would it be possible to
increase total random access throughput by adding more chips to a channel and
pipelining more requests?

~~~
ajross
It's per "bank". Chips often are partitioned into multiple banks, each with
its own row/column array. This gives some parallelism, because an access to
one bank won't invalidate the already-loaded row in the others (reads of bits
in the same row as the previous one are faster because they are stored/cached
in digital logic -- this is why sequential access to DRAM is faster than
random access).
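
A rough way to put numbers on why same-row reads are faster, sticking with the DDR3-1600 10-10-10 timings from the parent comment (illustrative values, not a datasheet):

```python
# Row-hit vs row-miss latency for DDR3-1600 with 10-10-10 timings.
clock_ns = 1 / 800e6 * 1e9   # 1.25ns per I/O clock

# Row hit: the row is already latched in the sense amps, so only the
# column (CAS) access is paid.
row_hit_ns = 10 * clock_ns

# Row miss: precharge the bit lines, activate the new row, then CAS.
row_miss_ns = (10 + 10 + 10) * clock_ns

print(f"row hit: {row_hit_ns:.2f}ns, row miss: {row_miss_ns:.2f}ns")
```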

Obviously the width of the DRAM bus itself is a form of parallelism:
fundamentally DRAM is always "one bit wide"; you get wider buses by using
multiple devices and feeding them the same commands.

And you can get even more parallelism by using more than one DRAM channel.
Intel Desktop CPUs have 2x 64 bit channels, the LGA2011 server parts have
4x64, Apple's new A6 made news this week by moving to a 2x32 bit design.

But note that those are all bandwidth improvements. They don't do anything to
the time it takes to get a bit out of an idle DRAM part. DRAM latency is
largely fixed in the modern world, until we find a different way of storing
bits.
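
The bandwidth side of that can be sketched like this (peak numbers only, assuming DDR3-1600 for each configuration mentioned above; the function name is made up for illustration):

```python
def peak_bw_gb_per_s(channels, bus_bits, megatransfers):
    """Peak bandwidth: channels * bus width in bytes * transfer rate.
    Ignores refresh, bus turnaround, and every other real-world overhead."""
    return channels * (bus_bits / 8) * megatransfers * 1e6 / 1e9

# Configurations from the comment above, all assumed at DDR3-1600:
desktop = peak_bw_gb_per_s(2, 64, 1600)  # Intel desktop: 2x64-bit
server = peak_bw_gb_per_s(4, 64, 1600)   # LGA2011: 4x64-bit
mobile = peak_bw_gb_per_s(2, 32, 1600)   # A6-style: 2x32-bit
print(desktop, server, mobile)
```

More channels scale the peak numbers, but every configuration pays the same row-cycle latency on a random access.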

~~~
tisme
Or switch to more expensive but faster SRAM.

------
Scene_Cast2
If I recall correctly, memory manufacturers have been ready to manufacture /
release DDR4 for some time now. The CPU companies are the ones that aren't
making the move until 2014.

~~~
bathat
Not being familiar with the DDR4 spec, I wonder how much performance benefit
DDR4 brings without increasing the size of the processor's cache line? As
others have mentioned, the latency for random access is quite high, and DDR
effectively just multiplies the bus width. For DMA and filling cache lines
DDRx is a win because you usually want to grab all those bits anyway (although
in those cases, the addresses tend to be contiguous, so it's not as
slow as "random"). But if your cache grabs bunches of 512 bits, having a
memory that is only faster when transferring a contiguous group of 1024 bits
isn't an obvious speedup.
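
One way to put numbers on that (a rough sketch assuming a single 64-bit DDR3-1600 channel and the 30-clock random-access latency discussed upthread; illustrative, not any specific part):

```python
# Time to fill one 512-bit cache line over a 64-bit DDR3-1600 channel.
bus_bits = 64
transfers_per_sec = 1.6e9
line_bits = 512

burst_transfers = line_bits // bus_bits               # 8 transfers, one burst
burst_ns = burst_transfers / transfers_per_sec * 1e9  # data movement: 5ns

# The random-access latency paid before the burst (30 clocks at 800MHz):
access_ns = 30 / 800e6 * 1e9                          # ~37.5ns

print(f"burst: {burst_ns:.1f}ns, access latency: {access_ns:.1f}ns")
# Doubling the burst to 1024 bits only doubles the ~5ns part;
# the ~37.5ns latency dominates either way.
```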

------
rogerbinns
For the folks that care about power consumption, there are already DDR3L and
DDR3U today that use lower voltages than standard DDR3. Intel's Ivy Bridge
chipsets can support DDR3 and DDR3L and I'm using the latter in my laptop.

------
programminggeek
Wasn't HP working on memristors that were going to be way faster than
traditional memory, but they've pushed it back a year or two so their partners
could adjust their biz models? Were they waiting for DDR4 maybe?

~~~
Tuna-Fish
Memristors are aiming for flash initially, not DRAM. They probably don't have
the write endurance to deal with DRAM-like loads. Also, DDR4 is an interface
spec that has very little to do with the actual RAM.

~~~
bloaf
Yep, it was my understanding that they would use memristors in flash first,
then SSDs, then RAM over the course of a few years.

------
sitkack
This will be a huge boon to virtualization and machines with high multitasking
workloads (refilling the caches quickly).

Cache locality and streaming reads and writes are still as important as ever.

------
twodayslate
Not until 2014 :(

