Ambit: In-Memory Accelerator for Bulk Bitwise Operations Using Commodity DRAM [pdf] (ethz.ch)
66 points by espeed on Jan 7, 2018 | 13 comments



Very interesting. At first I thought this was pushing processing closer to the data with some new special logic inside the DRAM. I'm not sure, but it seems to work with current, off-the-shelf memory.

The fact that Intel and NVIDIA are on the paper is very exciting too. Does this mean we could see these improvements with a simple software update?

The end of 2017 and start of 2018 have been very weird: we got an industry-wide performance degradation with Meltdown, and now we may be seeing an industry-wide performance upgrade from another hardware trick!
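For context, the kind of thing Ambit accelerates is a plain bulk bitwise loop like this (my sketch, not from the paper); as I understand it, they compute AND/OR by activating three DRAM rows at once, so the data never crosses the memory bus at all:

    #include <stddef.h>
    #include <stdint.h>

    /* Bulk bitwise AND of two bitvectors, e.g. a bitmap-index
       intersection. On a normal system every word crosses the memory
       bus three times (two reads, one write); Ambit would do the AND
       inside the DRAM chip instead. */
    void bulk_and(uint64_t *dst, const uint64_t *a,
                  const uint64_t *b, size_t nwords)
    {
        for (size_t i = 0; i < nwords; i++)
            dst[i] = a[i] & b[i];
    }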


I don't think it's available out of the box. They say it would require just 1% additional circuitry and describe doing experiments on a circuit simulator (quoting from memory, the PDF isn't loading at the moment), which doesn't quite sound like something already available on commodity RAM.


Huh? If I'm not mistaken, Ambit is not a new development. I remember reading about it (or at least about some other bitwise accelerator for commodity DRAM).


Twenty years ago, Sun had special RAM in their Elite3D cards to do write-only alpha blending and z-buffering (saturated updates) instead of the much slower read-modify-write: one PCI write instead of one read + CPU computation + one write. That might have been UPA or even SBus rather than PCI, but you get the idea. ATI and E&S later used it, too.

http://www.michaelfrankdeering.com/Projects/HardWare/p3DRAM/...

This looks like a variant of the Sun/Mitsubishi work, after a quick skim.
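To make the saturated-update point concrete, here's roughly what that special RAM replaces (a from-memory sketch, not the actual 3DRAM interface):

    #include <stdint.h>

    /* Conventional z-buffered write: read the old depth, compare,
       maybe write. The read half of the round trip is the slow part.
       With the compare logic inside the RAM, the host just posts
       (z, color) in a single write and the memory does the
       compare-and-update itself. */
    void zbuf_write(uint16_t *zbuf, uint32_t *cbuf, int i,
                    uint16_t z, uint32_t color)
    {
        if (z < zbuf[i]) {   /* the read the smart RAM eliminates */
            zbuf[i] = z;
            cbuf[i] = color;
        }
    }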


Having RAM chips able to do bzero() internally would give a general speedup across a pretty wide range of applications. It might help security with cheap zero-on-free, too.
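Something like this wrapper (hypothetical, and assuming the caller tracks allocation sizes) becomes nearly free if the chip can clear rows internally:

    #include <stdlib.h>
    #include <string.h>

    /* Zero-on-free. Today the clear costs a full write pass over the
       allocation; with in-DRAM bzero it would be one command to the
       chip. A real version would use explicit_bzero() so the compiler
       can't optimize the store away as dead. */
    void secure_free(void *p, size_t size)
    {
        if (p != NULL) {
            memset(p, 0, size);
            free(p);
        }
    }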


malloc and free could also be implemented in hardware.


Reminiscent of HPE's comments at the In-Memory Compute Summit in September 2017. GSI's PIM paper would be interesting with a few changes. Also, persistent memory will most probably be announced in the May time frame, which GSI's approach requires, since it needs a non-destructive read.


As an unrelated sidenote, I feel really bad for Onur Mutlu: his thesis work, runahead execution, basically became a giant attack vector thanks to Spectre (and perhaps Meltdown too).


I don't think speculation is dead, but I do think the MMU/L1/L2 system will be turned on its head.

It needed to be anyway, for cloud workloads. Peer guest VMs should never have been able to flush my caches. Modern CPUs are still designed for DOS, with some extra cruft tacked on the side.


Meltdown is fairly well handled (per-process tagging of TLB entries and separate kernel vs. userspace top-level mappings solve the problem without much perf impact).

For Spectre, the problem is ultimately a failure to completely roll back processor state. I don't want to downplay the complexity, but the state of the cache needs to be tracked: if speculative execution loaded cache lines and the speculation turns out to be incorrect, those cache lines need to be thrown away, effectively rolling back the cache. That shouldn't have a huge performance impact, but it seems to require new CPU designs.
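For reference, here's the classic Spectre v1 gadget (simplified from the paper); the whole attack is that the last line's cache footprint survives the squashed branch, which is exactly what rolling back the cache would fix:

    #include <stddef.h>
    #include <stdint.h>

    uint8_t array1[16];
    uint8_t array2[256 * 4096];  /* probe array: one page per byte value */
    uint8_t temp;

    void victim(size_t x, size_t array1_size)
    {
        if (x < array1_size) {
            /* Under misprediction this runs with x out of bounds: the
               secret byte selects which line of array2 gets cached.
               Registers are rolled back when the branch resolves; the
               cache line, today, is not. */
            temp &= array2[array1[x] * 4096];
        }
    }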


Intel CAT (Cache Allocation Technology) and similar technologies help a bit by subdividing the cache between trust zones. So if you have two VMs rented by different tenants and you have set up CAT to isolate them, they shouldn't be able to interfere with each other. Of course, CAT only covers the lowest level of cache, so there may still be attacks on stuff in L1/L2.

Edit: Paper on this subject: http://palms.ee.princeton.edu/system/files/CATalyst_vfinal_c...
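On Linux the interface to CAT is the resctrl filesystem (kernel 4.10+), if I remember right. Untested sketch, single-socket, made-up mask value; each bit in the mask is one way of the L3:

    #include <stdio.h>
    #include <sys/types.h>

    /* Confine a tenant's tasks to 4 ways of the L3 via resctrl.
       Assumes resctrl is mounted at /sys/fs/resctrl and the group
       directory "tenant1" already exists (mkdir creates a group). */
    int confine_to_partial_llc(pid_t pid)
    {
        FILE *f = fopen("/sys/fs/resctrl/tenant1/schemata", "w");
        if (!f) return -1;
        fprintf(f, "L3:0=00f\n");  /* cache ID 0 -> ways 0-3 only */
        fclose(f);

        f = fopen("/sys/fs/resctrl/tenant1/tasks", "w");
        if (!f) return -1;
        fprintf(f, "%d\n", (int)pid);
        fclose(f);
        return 0;
    }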


To fully close the hole, you would also need to load back in whatever cache lines were evicted by the speculative loads. Alternatively, you could provide some dedicated buffer space for speculative cache lines.

No matter what you do, this will result in a slowdown, either from re-loading data or from "wasting" cache space on buffers that could otherwise have been used generally.


Looking at the title of the paper, at least he has some sense of humor.



