Memristor-based non-volatile memory matches DRAM performance (bbc.co.uk)
70 points by azernik on Jan 30, 2012 | 45 comments

Very interesting video on the memristor. Long but worth it: http://www.youtube.com/watch?v=bKGhvKyjgLY

It's still not a replacement for DRAM, though. The article says it can only take a million rewrites before failing. Depending on what a computer is doing, I'll bet the main memory can easily hit that limit in days to weeks.

It is a replacement for SSD, not RAM (at least for now). Imagine your permanent storage being as fast as your RAM.

Do you think this will be a driving impetus for the move from SATA to PCI Express connected backing stores in consumer hardware?

PCI Express is the past, and so is SATA.

When you have L5 memory at nearly the same speed and price as L4 (RAM), the whole architecture will have to change.

Forget contemporary memory busses, forget contemporary R/W mechanisms, forget contemporary databases.

Well, not necessarily. We went from 20MB HDs in the early '90s to 4GB RAM now, with not much change to most things.

Busses et al would be redesigned, sure, but contemporary databases, software, virtual memory techniques etc, not so much. You can just tune their "magic numbers" accordingly.

http://rethinkdb.com/ just called and they disagree with you.

The thing is that architecturally nothing has really changed since the '70s. And even if it did, Oracle cannot just go and tweak some "magic numbers" (what would those be?) since the idea of spinning plates at the end of a long and perilous pipeline is at the core of any contemporary DBMS. To change this you can't just assign a junior developer for a week or two to "optimize" for SSD.

It is a monumental effort requiring that you throw away and/or revisit every presumption you made in the last 30-40 years that it took to build these DBMS.

It is similar to the introduction of the internal combustion engine to the world of horse and carriage.

At first glance, you just mount the engine on the carriage and voila! It turns out that the carriage had to be fundamentally changed to accommodate the internal combustion engine, to the degree that, besides having four wheels, a modern ICE-powered carriage looks nothing like the old horse-drawn one.

> http://rethinkdb.com/ just called and they disagree with you.

Yes. What about the remaining 99.999% of the industry?

> The thing is that architecturally nothing has really changed since the '70s.

Which is my case exactly. Despite going from KB of main memory to TB, and from a few MB of hard disk to PB, "nothing has changed". What makes you think this case will be different?

> And even if it did, Oracle cannot just go and tweak some "magic numbers" (what would those be?) since the idea of spinning plates at the end of a long and perilous pipeline is at the core of any contemporary DBMS.

I was talking about the OS level, not Oracle's. Tuning virtual memory and related pipelines.

And, no, the "idea of spinning plates" is not "at the core" of DBs. I don't even know what you imply by this. That, for some reason, Oracle, say, wouldn't take advantage and run orders of magnitude faster on pure dynamic memory? Oracle --and all DBs-- already runs just fine on non-spinning SSDs. And most DBs already have special tunings to keep the working set or even everything in main memory, and never touch the disk. Whether the underlying storage is a spinning platter or some dynamic memory 100 or 1000 times faster will not matter much.

Is that really a possibility? I assumed the 6 Gb/sec (as Wikipedia claims) for SATA 3 would be sufficient bandwidth for any drive storage.

I'll admit, I've not followed this sort of hardware closely but my understanding was that SATA is set on its own track, apart from the other buses like PCI and USB.

DDR-3's bandwidth is about 20 times SATA-3's.
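For anyone who wants to check the "about 20 times" figure, here is a rough sketch of the arithmetic. It assumes nominal interface rates only: SATA 3 at 6 Gbit/s with 8b/10b encoding, and a single channel of DDR3-1600 on a 64-bit bus. The choice of DDR3-1600 and the function names are assumptions for illustration, and real-world throughput will be lower on both sides.

```python
# Nominal peak-bandwidth comparison. These are interface rates,
# not measured throughput; DDR3-1600 is an assumed module speed.
SATA3_LINE_RATE_BITS = 6e9            # SATA 3 line rate: 6 Gbit/s
SATA3_ENCODING_EFFICIENCY = 8 / 10    # 8b/10b: 8 payload bits per 10 line bits

DDR3_TRANSFERS_PER_S = 1600e6         # DDR3-1600: 1600 million transfers/s
DDR3_BUS_BYTES = 8                    # 64-bit data bus


def sata3_bytes_per_s():
    """Usable SATA 3 payload bandwidth in bytes per second."""
    return SATA3_LINE_RATE_BITS * SATA3_ENCODING_EFFICIENCY / 8


def ddr3_bytes_per_s(channels=1):
    """Peak DDR3 bandwidth in bytes per second for the given channel count."""
    return DDR3_TRANSFERS_PER_S * DDR3_BUS_BYTES * channels


if __name__ == "__main__":
    ratio = ddr3_bytes_per_s() / sata3_bytes_per_s()
    print(f"SATA 3: {sata3_bytes_per_s() / 1e6:.0f} MB/s")
    print(f"DDR3-1600 (1 channel): {ddr3_bytes_per_s() / 1e9:.1f} GB/s")
    print(f"ratio: about {ratio:.0f}x")
```

With dual-channel memory (`channels=2`) the gap roughly doubles, so "about 20 times" is on the conservative side.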

I'm sure people will make SATA SSDs.

There is an architecture revolution to be had, but it isn't as easy as carving off a high speed bus from the CPU and turning your database speed up by a factor of 20.

If the write lifetime is a million writes, at 10ns/write you can burn that up in about 10 milliseconds.[1]

Maybe you trust your server hardware to last for three years.[2] You get about 38 writes per hour. I hope you aren't syncing a critical chunk of data once a minute.

So you have an insanely fast, nonvolatile store, but you can't just write to it willy-nilly. Let the engineering begin! (I'm sure it has for some people.)


[1] Malware can be expensive in a hurry. A couple quick loops and you can ruin a couple hundred blocks/second.

[2] If you have less than a few dozen servers you end up playing quality roulette. You have no idea which models from which vendors are going to hold up, so you make a guess based on reputation and price. Sometimes you are right, sometimes you are wrong. Sometimes you have a 50% mortality in 6 months. When you get a model that works well, by the time you know it lasts, you can't buy it any more. My strategy was always to buy a batch, run them in non-critical positions for 6 months to screen for early failures, if they were good, move the critical functions to them, but then get off of them before the age related failures begin. Three years was about my limit.
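The endurance numbers in this comment are easy to reproduce. A minimal sketch, assuming the figures quoted in the thread (one million rewrites per location, 10 ns per write, a three-year service life); the function names are made up for illustration. One million writes at 10 ns each comes to roughly 10 ms of back-to-back writing, and spreading the same budget over three years allows about 38 writes per hour.

```python
# Back-of-the-envelope endurance arithmetic using figures from the thread.
ENDURANCE_WRITES = 1_000_000   # rewrites per location before failure
WRITE_TIME_S = 10e-9           # assumed 10 ns per write
SERVICE_LIFE_YEARS = 3         # assumed hardware lifetime


def time_to_wear_out(endurance=ENDURANCE_WRITES, write_time_s=WRITE_TIME_S):
    """Seconds needed to exhaust one location with back-to-back writes."""
    return endurance * write_time_s


def sustainable_writes_per_hour(endurance=ENDURANCE_WRITES,
                                years=SERVICE_LIFE_YEARS):
    """Writes per hour to one location that make the budget last `years`."""
    hours = years * 365 * 24
    return endurance / hours


if __name__ == "__main__":
    print(f"wear-out time: {time_to_wear_out() * 1e3:.1f} ms")
    print(f"budget: {sustainable_writes_per_hour():.1f} writes/hour")
```

Both figures are per location; anything that spreads writes across many locations stretches the budget accordingly.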

You can get a PCIe solid state drive today with speeds higher than SATA3 can provide. Here's one with a read speed of 12.5 Gbit/s.


Device controllers used to be divided between a northbridge (memory, video) and southbridge (pcie, sata, usb, etc), but in the latest generation northbridge components are integrated into the CPU.




So far. This will dramatically improve. Original transistors had a low MTBF but the chemical and electrical processes were tuned until this was a non-issue.

This is literally like using transistors for the first time in the late 1950s.

This is the next wave of technology. It solves so many fundamental problems. I can't wait to have a few GiB of MRAM in a workstation.

Load/store architectures may go away due to this. Imagine 32Gb of CPU registers.

> Imagine 32Gb of CPU registers.

Sounds like the most expensive context switch ever.

The point is you don't switch contexts, you maintain parallel contexts.

If that were a win (having multiple sets of registers that you can "buffer flip" between) don't you think current CPUs would implement that with, say 8 or 16 banks of registers?

They do. SPARC64 has hardware contexts. Technically, the architecture wouldn't require this. A process's state would be a mapped linear segment of memory so you can have as many contexts as you can fit in RAM. There is no need then for traditional "save everything" context switching. You just move the CPU's execution context to a different area in RAM and the context is there.
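A toy model of the idea, for what it's worth: every context's registers live in one flat memory, and a "switch" only moves a base pointer. This is a hypothetical sketch of the scheme described above, not how SPARC64 or any real CPU implements hardware contexts; `FlatContextCPU` and `REGS_PER_CONTEXT` are made-up names.

```python
# Toy model: register contexts are slices of one flat memory, and a
# context switch changes a base pointer instead of saving and restoring.
REGS_PER_CONTEXT = 16  # assumed register count per context


class FlatContextCPU:
    def __init__(self, num_contexts):
        self.num_contexts = num_contexts
        self.memory = [0] * (num_contexts * REGS_PER_CONTEXT)
        self.base = 0  # start of the active context's registers

    def switch_to(self, context_id):
        """Activate another context: no copying, just move the base pointer."""
        if not 0 <= context_id < self.num_contexts:
            raise IndexError("no such context")
        self.base = context_id * REGS_PER_CONTEXT

    def _address(self, reg):
        if not 0 <= reg < REGS_PER_CONTEXT:
            raise IndexError("no such register")
        return self.base + reg

    def read(self, reg):
        return self.memory[self._address(reg)]

    def write(self, reg, value):
        self.memory[self._address(reg)] = value


if __name__ == "__main__":
    cpu = FlatContextCPU(num_contexts=4)
    cpu.write(0, 111)      # context 0, register 0
    cpu.switch_to(1)
    cpu.write(0, 222)      # context 1, register 0
    cpu.switch_to(0)
    print(cpu.read(0))     # context 0's value is still in place
```

The model says nothing about latency, which is the crux of the objections elsewhere in the thread: it only shows that the switch itself needs no save/restore when contexts are memory-resident.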

Isn't that what Intel's hyper-threading does?

AFAIK the main problem with hyper-threading is cache contention (two hyperthreads on the same CPU thrashing each other's cache).

Anyway, I fail to see the point of this discussion. TFA states that MRAM attains speed comparable to that of DRAM, which is much slower than CPU cache (at least one or two orders of magnitude slower), so that won't go away just now.

Also, the article speaks of "write speeds" (whatever that means) of tens of nanoseconds but says nothing of latency. I suppose there are no refresh periods, which might improve over DRAM a little. It all seems very vague so far; I'm looking forward to some more technical and all-encompassing performance numbers.

Modern CPUs have dozens to hundreds of registers. It's difficult to scale registers though due to fundamental technological limits.

Agreed on all points. This is a total gamechanger in terms of hardware and software architecture. Even if it follows the relatively slow uptake of SSDs, it completely revolutionizes the *aaS side of the net.

Amazon is a big player with AWS and this is the perfect opportunity for someone to come in and eat their lunch and change everything.

I don't see how you are going to fit those registers into the CPU. This new kind of RAM will still be used for main memory.

The good news is we don't need hard disks any more. This will indeed change a lot of what we know about operating systems.

Memristors can potentially perform logic as well.

I'm excited to see them used as neural networks http://www.eetimes.com/electronics-news/4088605/Memristor-em...

It only said it was RAM speed, not register or cache speed. There are speed of light constraints in the memory hierarchy too, surely?

Yes, it said it was as fast as DRAM, not SRAM (the latter is what is used for cache and registers).

A memristor is a smaller and simpler component than one register cell. Because of this, leakage and speed of light are less of an issue than with current technology. I wouldn't expect a replacement for a decade though.

That may be, but OP was talking about having 32 Gb of registers and obsoleting load/store instructions. Current technology doesn't have 32 Gb of SRAM in registers, it has less than 1k. Even if the cost of SRAM wasn't an issue, I would be surprised if you could increase the size of the register file by 30,000,000x while maintaining the same latency and throughput characteristics.

I was the OP. The CPU spends a lot of time moving shit around rather than doing work.

My point is to remove the distinction between the register file and the main memory so that the entire CPU's working set is linear and no copies are required, therefore drastically increasing speed.

When you do this, you lose all the cache control latency and context switch overhead, resulting in a much smaller and faster core, leaving plenty of space for 32Gb on die :)

No existing architectures will do this as they rely on the memory hierarchy. I'm talking about a new architecture.

> My point is to remove the distinction between the register file and the main memory so that the entire CPU's working set is linear and no copies are required, therefore drastically increasing speed.

That has been tried several times before. As long as small is "enough faster", small&fast+large+overhead beats large. (In really fast processors, active register values are in lots of places, so they don't even access the register file except for values that haven't been used for a while.)

> When you do this, you lose all the cache control latency and context switch overhead, resulting in a much smaller and faster core,

Huh? Context switch overhead is time, not space. Cache control is negligible space.

> leaving plenty of space for 32Gb on die :)

Not yet you don't. None of this stuff is as dense as DRAM, and DRAM is just now hitting 4Gbit. Since fast processors do take some space....

What will that new architecture look like? ... Since I can't seem to reply to your comment...so an 8 bit processor :-) It would be interesting to pair this with the new 100 core chips.

Like zero page in the 6502.

Memristors will have a major impact on hardware architecture in the near to medium term. They promise permanent storage not just without the costs imposed by SSD (much slower performance than DRAM), but also random access without the costs imposed by hard drives (variable seek times depending on how data happens to be laid out on physical discs).

IMHO the impact on software architecture is likely to be much less pronounced at first, because software ecosystems -- OS kernels, libraries, utilities, applications, etc. -- can evolve only gradually over a period of many years. Consider that Ken Thompson and Dennis Ritchie reportedly created the /usr directory because they ran out of space on a 1.5MB hard disk (!) more than 40 years ago (!), yet we're still living with this directory ( see http://lists.busybox.net/pipermail/busybox/2010-December/074... ).

No more RAM chip freezing to extract the encryption keys for Mr. Government Man...

This is actually a valid point, most modern encryption schemes rely on the fact that the RAM can't be dumped easily.

I guess you could keep at least some volatile memory for storing sensitive information such as encryption keys. Of course unless the rest of the MRAM is encrypted you may indeed leak potentially sensitive data.

Maybe dedicated hardware could encrypt/decrypt the RAM contents on the fly when the CPU or the devices access it, but it sounds costly.

Processor cache could easily hold the encryption keys. Now that memory controllers are integrated into modern CPUs, this could be achieved quite easily when revamping the architecture, by adding a couple of special instructions.

No new instructions necessary: http://www1.informatik.uni-erlangen.de/tresor .

Interesting link. However, it doesn't solve the security issues caused by non-volatile unencrypted system memory. I guess custom hardware could encrypt/decrypt the data going through the memory controller on the fly.

Oh yeah, silly me, I had forgotten about that. However, a couple of specialized instructions or storage areas in cache or registers would probably make it a little easier to use.

"At present, the endurance of DRAM is effectively a lifetime of usage"

Liked the article, but that's a self-referencing statement

It means the lifetime of the system it's placed in. DRAM will last until you've thrown away all of the other bits of your system.

DRAM is also commonly sold with a "lifetime warranty", which means for as long as that RAM is still being made at the factory and the vendor has the reasonable ability to replace it.

It isn't. It introduces the term with quotes first: "endurance". Then this sentence defines what "endurance" means for DRAM.

Perhaps they meant a human lifetime.
