
A novel data-compression technique for faster computer programs - pps
https://techxplore.com/news/2019-04-data-compression-technique-faster.html
======
foota
Interesting paper. The original concept is at
[http://people.csail.mit.edu/sanchez/papers/2018.hotpads.micr...](http://people.csail.mit.edu/sanchez/papers/2018.hotpads.micro.pdf),
and the follow up paper the article is written about is
[https://people.csail.mit.edu/poantsai/papers/2019.zippads.as...](https://people.csail.mit.edu/poantsai/papers/2019.zippads.asplos.pdf).

Afaict the gist of the hotpads paper is this:

Dump the existing way we organize processor caching; instead, make it look like
a generational GC, where objects can move between different cache levels, and
pointers can be rewritten to point to the new cache level the pointee is in.
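
If I've got that right, the mechanism can be caricatured in a few lines of
software (a loose sketch of my own reading, nothing like the paper's actual
hardware; all names here are made up):

```python
class Hierarchy:
    """Toy 'pads' hierarchy: evicted objects are demoted a level, and the
    bookkeeping that maps object -> level stands in for pointer rewriting."""

    def __init__(self, sizes=(2, 4)):
        # bounded fast pads plus an unbounded backing store ("main memory")
        self.pads = [dict() for _ in sizes] + [dict()]
        self.caps = list(sizes) + [None]
        self.where = {}  # oid -> pad index; stands in for rewritten pointers

    def allocate(self, oid, payload):
        self._make_room(0)
        self.pads[0][oid] = payload  # new objects start in the fastest pad
        self.where[oid] = 0

    def _make_room(self, lvl):
        cap = self.caps[lvl]
        if cap is None or len(self.pads[lvl]) < cap:
            return
        victim = next(iter(self.pads[lvl]))  # oldest resident (FIFO-ish)
        self._make_room(lvl + 1)
        self.pads[lvl + 1][victim] = self.pads[lvl].pop(victim)
        self.where[victim] = lvl + 1  # "rewrite" pointers to the victim

    def load(self, oid):
        return self.pads[self.where[oid]][oid]

h = Hierarchy()
for i in range(4):
    h.allocate(f"o{i}", i)
# the two oldest objects were demoted; their "pointers" now name level 1
```

The real trick, of course, is doing the pointer rewriting in hardware instead
of through an indirection table like this.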

~~~
acqq
I don't know if I understood it correctly, but it seems their tested
implementation depends on circuits that operate faster than the cores:

"We have written the RTL for these circuits and synthesized them at 45nm using
yosys [55] and the FreePDK45 standard cell library [23]. The compression
circuit requires an area equivalent to 810 NAND2 gates at a 2.8 GHz frequency.
The decompression circuit requires an area equivalent to 592 NAND2 gates at a
3.4 GHz frequency. These frequencies are much higher than typical uncore
frequencies (1-2 GHz), and a more recent fabrication process would yield
faster circuits."

~~~
Dylan16807
Cores are usually capable of more than 3.4GHz, and only run slower for power
reasons. So a tiny circuit at that speed wouldn't be a problem.

But they didn't give enough information to say how much headroom there is,
with no details about how shallow/deep the circuit is or voltage or...

~~~
acqq
> and only run slower for power reasons.

And if there weren't power limitations we'd already have cores running at even
higher clock speeds. So a solution that depends on some circuit running faster
than the cores may not be practical. That's why I pointed out that assumption:
I think it has to be understood before the solution can be considered usable.

~~~
Dylan16807
600/1400 gates is _so small_. If that gave even a .01% boost it would be worth
the energy. It's basically zero in the total power budget, even at a high
clock.
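
A quick back-of-envelope supports that. Only the gate counts and frequency
come from the paper; every other constant below is my own rough assumption for
a 45nm-class process:

```python
# Rough dynamic-power estimate for the (de)compression circuits.
# Gate counts and frequency are from the paper; the rest are my guesses.
gates = 810 + 592          # compression + decompression NAND2 equivalents
energy_per_toggle = 2e-15  # joules per gate toggle, assumed ~2 fJ at 45nm
freq = 3.4e9               # Hz, the faster of the two reported frequencies
activity = 0.2             # assumed fraction of gates switching per cycle

power = gates * energy_per_toggle * freq * activity  # watts
cpu_budget = 100.0                                   # assumed ~100 W chip
print(f"{power * 1e3:.1f} mW, {100 * power / cpu_budget:.4f}% of budget")
```

Even if my per-gate numbers are off by an order of magnitude, it lands in the
low milliwatts, i.e. noise against a whole-chip power budget.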

------
sp332
Rob Matheson writes for MIT's Technology Review, which has a reputation for
massively over-hyping anything done by MIT researchers. Just take the
performance claims with a grain of salt until someone who doesn't work for MIT
writes about it.

------
kannanvijayan
This was interesting to read. It seems like a sophisticated, hardware-enabled
generational GC on top of a managed heap.

It reminds me somewhat of an optimization that my colleague Brian Hackett
implemented in the SpiderMonkey JS engine, which used the runtime type-tracking
infrastructure to discover objects that had specialized constituents (i.e. a
slot that had only ever held a boolean).

The system would notice this and then transition the object to a new layout
with unboxed entries where appropriate. Of course this included
de-optimization hooks to allow objects to transition back to the general
"boxed" layout if a mutation de-specialized the slot.
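
The specialize-then-deopt dance could be caricatured like this (a rough
software analogy of my own, not SpiderMonkey's actual machinery):

```python
class SpecializedSlot:
    """Slot that starts out specialized to bool and de-optimizes to a
    general 'boxed' layout the first time a non-bool is stored."""

    def __init__(self):
        self.specialized = True  # optimistic: assume bool until disproven
        self.value = False

    def store(self, v):
        if self.specialized and not isinstance(v, bool):
            self.specialized = False  # de-opt hook: fall back to boxed layout
        self.value = v

    def load(self):
        return self.value

slot = SpecializedSlot()
slot.store(True)   # type-stable store: stays in the specialized layout
slot.store(3.14)   # de-specializing mutation: triggers the de-opt
```

The engine gets to use the compact unboxed representation for as long as the
program stays type-stable, and only pays the transition cost when it doesn't.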

This technique delivered some significant performance improvements when it
came to computationally heavy, type-stable code (such as what we find in the
Octane benchmarks). It wasn't as effective on type-unstable code.

I'd expect this sort of approach has a lot of promise for most traditional
standalone programs written in a statically typed, managed language (e.g.
Java).

------
_wmd
Paper:
[https://people.csail.mit.edu/poantsai/papers/2019.zippads.as...](https://people.csail.mit.edu/poantsai/papers/2019.zippads.asplos.pdf)

Seems this cannot be read without first understanding "object based memory
hierarchies", which is new to me. This coauthor's page has a ton of related
papers:
[http://people.csail.mit.edu/poantsai/](http://people.csail.mit.edu/poantsai/)

------
jdnier
This reminds me of "RAM Doubler" from the mid 90s on 68030/40 Macs.
[https://tidbits.com/1996/10/28/ram-doubler-2/](https://tidbits.com/1996/10/28/ram-doubler-2/)

I do remember the joy of doubling 16MB of RAM to 32MB.

~~~
unforeseen9991
yes!

And that scene from Johnny Mnemonic was priceless for me:

[https://www.youtube.com/watch?v=ftzx_EdLV_s](https://www.youtube.com/watch?v=ftzx_EdLV_s)

------
Digit-Al
What happens if you're using a functional language?

