
Intel’s first Optane SSD: 375GB that you can also use as RAM - xbmcuser
https://arstechnica.com/information-technology/2017/03/intels-first-optane-ssd-375gb-that-you-can-also-use-as-ram/
======
jpalomaki
Somebody linked an "Intel to mislead press on Xpoint next week" article from
SemiAccurate to another thread on the same topic. Interesting read and adds
some context to the announcement. [http://semiaccurate.com/2017/03/10/intel-
mislead-press-xpoin...](http://semiaccurate.com/2017/03/10/intel-mislead-
press-xpoint-next-week/)

~~~
rayiner
By "broken" SemiAccurate seems to mean "only twice as fast, 10x lower latency,
and 3x the endurance of NAND." So who is misleading whom?

~~~
RachelF
Probably just marketing hype.

Some numbers:

Modern DDR4 RAM quad channel: 50,000MB/s

Optane SSD: 2,000MB/s

Also nothing on the PCIe bus is going to have a latency of RAM to the OS,
unless mapped as RAM.

~~~
patrickg_zill
Using memtest86 on dual core Opterons got a measurement of about 2200
Mbytes/second, I think, with DDR1 PC3200 ECC RAM.

I think you could get a lot of work done with 750GB of "RAM" (two of these, or
the one coming later) and a dual-core Opteron setup.

~~~
mrb
DDR1 PC3200 single channel should give you 25.6 GB/s in theory.

~~~
undersuit
Isn't the PC component an upper limit on the performance of a DIMM? PC3200
means 3200MB/s, while a single DDR3-1600 PC12800 DIMM would max out at a
theoretical 12.8GB/s.

Expand the tables here: [http://www.crucial.com/usa/en/support-memory-speeds-
compatab...](http://www.crucial.com/usa/en/support-memory-speeds-
compatability)

~~~
mrb
Err I made a mistake, you are right. PCx means a limit of "x" per channel. My
brain read PC3200 as "DDR-3200" ie. 3200 MT/s.

High-end server CPUs such as Intel Broadwell support quad-channel DDR4-2400
PC19200, which is 76.8 GB/s, a lot more bandwidth than the Optane.
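
These peak figures all follow from the same arithmetic: transfer rate times the 64-bit (8-byte) bus width per channel, times the channel count. A quick sketch of the calculation:

```python
# Peak theoretical DRAM bandwidth: transfers/s x 8-byte (64-bit) bus x channels.
def peak_bandwidth_gbs(mt_per_s, channels=1, bus_bytes=8):
    """MT/s -> GB/s (decimal), per the usual PC-rating convention."""
    return mt_per_s * bus_bytes * channels / 1000

print(peak_bandwidth_gbs(400))               # PC3200 (DDR-400): 3.2
print(peak_bandwidth_gbs(1600))              # PC12800 (DDR3-1600): 12.8
print(peak_bandwidth_gbs(2400, channels=4))  # quad-channel DDR4-2400: 76.8
```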

------
ccleve
This is a big deal.

We've always made a distinction between memory and disk. Much of computer
science is about algorithms that recognize the difference between slow,
persistent disk and fast, volatile RAM and account for it somehow.

What happens if the distinction goes away? What if all data is persistent?
What if we can perform calculations directly on persisted data without pulling
it into RAM first?

My guess is that we'll start writing software very differently. It's hard for
me to predict how, though.

~~~
Const-me
To some extent, that change has already happened.

The RAM is no longer fast: unless cached, it takes around 150 CPU cycles to
access the RAM.

The RAM is no longer byte addressable. It’s closer to a block device now, the
block size being 16 bytes for dual-channel DDR, 32 bytes for quad channel.

Too bad many computer scientists who write books about those algorithms prefer
to view RAM in an old-fashioned way, as fast and byte-addressable.

~~~
xorblurb
> the block size being 16 bytes for dual-channel DDR, 32 bytes for quad
> channel.

For most practical purposes on x86 computers, I believe the block size to
consider should be at least a cache line, so 64 bytes.

------
gjm11
Originally Intel claimed that this new technology would offer 1000x shorter
latencies and 1000x better endurance than NAND flash, and 10x better density
than DRAM. The figures they're quoting now are more like 10x shorter latencies
and 3x better endurance (compared with flash), and 2.5x better density
(compared with RAM).

The article linked here says "3D XPoint has about one thousandth the latency
of NAND flash" but I don't see any actual evidence for that. The paragraph
that says it is followed by a link to actual specs for a "3D XPoint" device,
saying: "the Intel flash SSD has a 20-microsecond latency for any read or
write operation, whereas the 3D XPoint drive cuts this to below 10
microseconds." which sounds to me more like a 2x latency improvement than a
1000x improvement.

So I ask the following extremely cynical question. Is there any evidence
available to us that's inconsistent with the hypothesis that _actually there
is no genuinely new technology in Optane_? In other words, have they
demonstrated anything that couldn't be achieved by taking existing flash
technology and, say, adding some redundancy and a lot more DRAM cache to it?

[EDITED to add:] I am hoping the answer to my question is yes: I'd love to see
genuine technological progress in this area. And it genuinely is a question,
not an accusation; I have no sort of inside knowledge here.

~~~
mankash666
I'm from the nand flash industry. There seem to be a few fundamental
improvements in XPoint. For one, achieving a 3X endurance improvement while
keeping the same process size (dimensions of the memory cell) is new.

That XPoint is byte addressable is rather impressive, as the circuitry and
metal layers (wires) needed for this are a lot more than for page-addressable
NAND.

The true test is when they connect it directly to DIMMs versus the PCIE bus.
Latency numbers there may further prove fundamental improvements in
technology.

~~~
amygdyl
I think the byte addressability is a software layer from ScaleMP.

I have no idea, but some concern about how that might affect latency.

------
sologoub
At a previous employer, we built a system using Druid as the primary store of
reporting data. The setup worked amazingly well with the size/cardinality of
the data we had, but was constantly bottlenecked paging segments in and out of
RAM. Economically, we just couldn't justify a system with enough RAM to hold
the primary dataset. As a result, we had to prioritize data aggressively,
focusing on the more recent transactions and locating them on the few servers
with very high RAM that we did have. Historic data segments had to go through
a lot of paging in/out of RAM. User experience on YTD (year-to-date) or YOY
(year-over-year) reports really suffered as a result.

I don't have access to the original planning calculations anymore, but 375GB
at $1520 would definitely have been a game changer in terms of performance/$,
and I suspect it would have been good enough to make the end user feel like
the entire dataset was in memory.

~~~
Dylan16807
Make sure you're looking at updated prices for ram too. 16x16GB of registered
ECC DDR3 is about the same price and enormously faster.

~~~
sologoub
Sure, but I believe we were limited by the available chassis to far fewer than
16 slots.

~~~
Dylan16807
Well the first google result for "1u 16 dimms" is a refurbished
chassis+motherboard+PSU for a hundred bucks. Brand new costs more but not
terribly so; the main cost is the ram whether you go 8 slots or 16.

These SSDs have situational uses but unless you want 10+ TB in one server you
can get a system with >50% as much actual RAM for the same price.

~~~
sologoub
It's not the cost. We ran standardized chassis, so whatever our ops had is
what they had...

------
ChuckMcM
Yay! Nice to see these things becoming more real. The choice of U.2 is
interesting, it might force wider adoption of that form factor.

This is definitely going to change the way you build your computational
fabric. Putting that much data that close to the CPU (closeness here measured
in ns) makes for some really interesting database and distributed data
applications. For example, a common limitation in MMO infrastructure is
managing everyone getting loot and changing state (appearance, stats, etc).
The faster you can do those updates consistently the more people you can have
in a (virtual) room at the same time.

~~~
wtallis
The U.2 form factor has had no trouble catching on in the datacenter, which is
where these first Optane SSDs are intended to be used. Intel is also offering
standard half-height half-length add-in cards, and M.2 is simply too small to
accommodate much 3D XPoint memory.

------
piinbinary
With drives like this, the approach of "throw more hardware at it" continues
working for databases to the point where most database loads in the world can
be handled on a single machine.

~~~
jrockway
Scaling beyond what fits inside one computer is a reasonable concern that this
addresses, but ultimately that one computer can be sucked up by a tornado at
an inconvenient time, so distributed systems will always be necessary for
availability and durability.

~~~
yjftsjthsd-h
Resiliency, yes, but maybe not performance. Still a great win :)

------
olavgg
I wonder what the performance of these is with PostgreSQL's pg_test_fsync,
which is one of the proper tools for benchmarking an SSD. I get 4000 iops with
my Intel 320 SSD and 9000 iops with an Intel S3700. For comparison, the Intel
600p maxes out around 1500 iops, and the Samsung 750 Evo at 400 iops.
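
For anyone without pg_test_fsync handy, the core of what it measures can be sketched in a few lines of Python (a rough stand-in, not the real tool; results vary wildly by device and filesystem):

```python
# Crude fsync benchmark in the spirit of pg_test_fsync: time N small
# synchronous writes to one file and report effective IOPS.
import os, tempfile, time

def fsync_iops(n=200, size=8192):            # 8 kB, PostgreSQL's page size
    fd, path = tempfile.mkstemp()
    try:
        block = b"\0" * size
        start = time.perf_counter()
        for _ in range(n):
            os.pwrite(fd, block, 0)          # rewrite the same page
            os.fsync(fd)                     # force it to stable storage
        return n / (time.perf_counter() - start)
    finally:
        os.close(fd)
        os.unlink(path)

print(f"{fsync_iops():.0f} fsync'd writes/s")
```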

~~~
snaky
"Why IOPS Suck and Everything You Know About Them is Probably Wrong!" -
[https://www.youtube.com/watch?v=cEb270L5Q1Y](https://www.youtube.com/watch?v=cEb270L5Q1Y)

------
xt00
The smart thing that Intel is doing is making stuff that they know big cloud
providers like AWS etc. will pay crazy amounts for and buy in huge volumes.
The "use it as RAM" is incredibly valuable -- especially to bring down costs
for databases. For example, running a database with an allocated 32GB of RAM
is pretty expensive per month. And if somebody like AWS sold a cheaper DB
instance version that ran from this drive as its memory (or was smart paged),
then that could bring down the cost of allocating huge databases to memory
with a performance hit that many people would be willing to take to save the
money.

~~~
tankenmate
64 bit machines (given the right controllers etc) should be able to map the
whole 375GB / 750GB directly into the memory space. You'd almost certainly
need a kernel driver, similar to the way a graphics card is mapped into the
address space but isn't treated as regular RAM. With the right driver you
could just mmap() the address space.
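
A minimal sketch of that idea in Python, using an ordinary file as a stand-in for a hypothetical Optane device node (the path and driver behavior here are assumptions; the point is just the byte-granularity access mmap() gives you):

```python
# mmap() sketch: an ordinary file stands in for an Optane device node.
# With a suitable kernel driver, the same calls could map the device's
# storage directly into the process address space.
import mmap, os

PATH = "xpoint_standin.bin"        # stand-in for e.g. /dev/xpoint0 (made-up name)
with open(PATH, "wb") as f:
    f.truncate(1 << 20)            # 1 MiB stand-in region

with open(PATH, "r+b") as f:
    mem = mmap.mmap(f.fileno(), 0) # map the whole "device"
    mem[0:5] = b"hello"            # byte-granularity write
    assert mem[0:5] == b"hello"
    mem.flush()                    # msync(): persist dirty pages
    mem.close()

os.remove(PATH)
```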

~~~
wtallis
Intel's strategy here is actually to continue using the standard NVMe
interface for Optane SSDs, and to offer a hypervisor that does the job of
presenting to the guest OS a pool of memory that is backed by a combination of
DRAM and 3D XPoint.

------
Animats
The big limitation of this device is wear. "Optane SSDs can safely be written
30 times per day", says the article. That implies a need for wear monitoring
and leveling. Although you can modify one byte at a time, the need to monitor
wear implies that just memory-mapping the thing isn't going to work.

Wear management could be moved to hardware, though, using a special MMU/wear
monitor/remapping device. If you're using this thing as a level of memory
below DRAM, viewing DRAM as another level of cache, something like that would
be necessary. That's one application.

This device would make a good key/value store. MongoDB in hardware. Then you
don't care where the data is physically placed, and it can be moved for wear
management.

~~~
viraptor
It doesn't say that wear leveling isn't already present inside the box. It
could effectively mean rewriting the whole storage 30 times a day.

~~~
Animats
If writes can be done at the byte level, the bookkeeping and indirection for
byte level wear leveling would take more space than the data. If wear leveling
is required, it has to be done on blocks of some reasonable size. They might
be smaller than for flash, though.
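
A back-of-envelope for that point: the remap pointer for each unit has to be wide enough to address the whole 375GB device, so at byte granularity the metadata outweighs the data (my own arithmetic, assuming one full-width pointer per remappable unit):

```python
# Remapping-table overhead per data byte for a 375 GB device,
# assuming one pointer per remappable unit.
import math

CAPACITY = 375 * 10**9

def overhead_ratio(unit_bytes):
    units = CAPACITY // unit_bytes
    ptr_bits = math.ceil(math.log2(units))  # pointer wide enough for any unit
    ptr_bytes = math.ceil(ptr_bits / 8)
    return ptr_bytes / unit_bytes           # metadata bytes per data byte

print(overhead_ratio(1))     # byte granularity: 5x more metadata than data
print(overhead_ratio(4096))  # 4 KiB blocks: under 0.1% overhead
```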

------
0x0
What's the endurance like if you actually use it as RAM? How many times can
you do a "label1: inc [rax]; jmp label1" loop with rax pointing to a
particular byte address on the SSD? (With GHz CPUs, wouldn't that mean
giga-writes per second? Isn't NAND rated for 10k-100k writes total, and if
this is rated for 1000x more than that, you'd still hit 10m-100m total writes
in about a second?)
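
The arithmetic checks out, if every store actually reached the cell. A quick sketch (assuming ~1e9 writes/s hit the media; in reality the CPU cache would absorb nearly all of them):

```python
# Time for a tight store loop to exhaust a single address's write
# endurance, assuming ~1e9 writes/s actually reach the media.
def seconds_to_wearout(endurance_writes, writes_per_s=1e9):
    return endurance_writes / writes_per_s

for endurance in (1e4, 1e5, 1e7, 1e8):   # NAND-ish vs. 1000x better
    print(f"{endurance:.0e} writes -> {seconds_to_wearout(endurance)} s")
```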

------
SergeAx
I wonder why no one has mentioned applying this technology in mobile phones.
The most obvious case: it would be possible to bring the entire system out of
a hibernated state in the time the user takes to pull the phone from a pocket
and press the "wake" button. The power and computing cost of
hibernating/restoring the system would be only slightly north of zero, which
leads to a dramatic increase in battery life.

Not to mention the smaller form factor and lower power consumption of the
chip itself.

------
floatboth
Meanwhile, the Raspberry Pi is using a garbage microSD controller that
corrupts cards… the divide between cheap and high end stuff is just
mindblowing these days.

Also, normal (NAND) NVMe M.2 SSDs are still TWICE as expensive as good old
SATA ones, at least in my country… And they want to push Optane into the
consumer space later this year… who even needs that much performance at home?

~~~
adventured
> who even needs that much performance at home

In the next ten years (closer to ten than not) we'll need it for massive
and/or hyper-intricate 4K VR worlds, the assets of which you won't want to
download every time you load up the world(s). You'll want to hold as many of
the assets locally in something extremely fast, if not RAM then the next best
thing. That is, until we commonly have 10gbps-plus to the home.

~~~
floatboth
RAM and VRAM these days are huge, and SATA is definitely fast enough to stream
assets from. Heck, I load most of my games from an old 2.5" spinning HDD that
I pulled from a dead laptop.

I think more improvement is needed on the VRAM side than on the disk side, so
HBM2 hype > Optane hype :)

------
willdehaan
It would be useful to move some SQL DB indices to an Optane drive, with DB
rows on arrays of M.2 flash drives, and do SQL consolidation/redundancy over
the network out of band, slowly. This could scale somewhat cheap OSS SQL
clusters.

This starts to make sense in a single server node with enough 4-channel PCIe
interfaces. AMD Naples?

------
faragon
So finally we're going to get from Intel what HP promised years ago as
"memristors".

~~~
petercooper
Memristors theoretically go further than mere storage. They enable the idea of
data storage and processors eventually becoming the same thing. I hope they
continue with the work because some of what I was reading about it was
fascinating - rather than have caches, you could, in theory, write temporary
routines that ran in physical locality to the data they had to process.

~~~
Dylan16807
You can already do that, it's just that dumb storage can be packed more
densely than smart storage.

Memristors can make things more dense, but that applies to both options.

------
aikorevs
Tried to look up modern memory latency numbers but couldn't find any, because
the "numbers every programmer should know" are about 10 years old, as I
understand it.

------
gbrown_
It _feels_ like Intel are putting this out in lieu of actual hardware. As
always remain skeptical until real silicon appears on the market.

~~~
sciurus
Isn't the article about real silicon appearing on the market?

"Initial limited availability starts today, for $1520"

~~~
gbrown_
The "limited availability" is almost certainly for those with pre-arranged
deals with Intel, and "2H" is almost always Q4. That's not to say this isn't
an interesting product, but I have a feeling Optane won't live up to the hype.
I'm reserving judgment until it gets into the wider market / has more real-
world exposure.

------
nimos
Seems like this would be ideal in cases where you are waiting for file sync in
multiple locations which I assume a lot of banks/corporations do.

Interesting it seems to be marketed as cheaper memory. You'd think at first
they'd try and rip super high margins out of banks/corps by selling it as
"persistent" memory.

Although I guess if you're waiting for file writes in multiple locations, the
network overhead makes the actual write sort of irrelevant...

------
frozenport
I would opt for a more conventional solution: you can get 256GB of RAM for
under $2k, and then enable write caching.

~~~
comboy
/dev/null is also pretty fast

But for use cases where data matters, fsync seems like a reasonable idea.

~~~
ori_b
You can also attach batteries to the RAM. This is already a solution people
ship in production.

~~~
comboy
Any references? I tried to find something but I failed. I'm assuming we are
talking about something different than UPS + going to sleep? It seems very
tricky to just keep RAM powered.

~~~
pkaye
They are called NVDIMMs. One example [http://www.vikingtechnology.com/arxcis-
nv](http://www.vikingtechnology.com/arxcis-nv)

I think some of them are protected by both a battery and flash memory for long
term backup.

------
mrfusion
Is this based on memristors? That's pretty amazing. I thought HP owned that.

------
dgudkov
Ramdisks are back. This time persistent.

~~~
penagwin
I was under the impression that this is the opposite. It's a 'disk' (RAM
disks are just storage that can store files, so I'm taking 'disk' to mean it
stores files) like an SSD that's fast enough to act like RAM?

