
To some extent, that change has already happened.

The RAM is no longer fast: unless the data is already in cache, a RAM access takes around 150 CPU cycles.

The RAM is no longer byte-addressable. It’s closer to a block device now, with a block size of 16 bytes for dual-channel DDR and 32 bytes for quad-channel.

Too bad many computer scientists who write books about those algorithms prefer to view RAM in an old-fashioned way, as fast and byte-addressable.
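For anyone curious, here is a minimal sketch (mine, not a quote from anyone above) of the kind of measurement behind numbers like that: chase dependent pointers through a buffer that fits in cache and through one that does not. The buffer sizes, the POSIX clock_gettime timer, and the ~150-cycle ballpark are assumptions; absolute results vary by machine.

    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    /* Each load depends on the previous one, so the loop time is roughly
     * one memory-access latency per step. */
    static double ns_per_load(const size_t *next, size_t steps) {
        struct timespec t0, t1;
        size_t i = 0;
        clock_gettime(CLOCK_MONOTONIC, &t0);
        for (size_t s = 0; s < steps; s++)
            i = next[i];
        clock_gettime(CLOCK_MONOTONIC, &t1);
        volatile size_t sink = i;          /* keep the chain alive */
        (void)sink;
        return ((t1.tv_sec - t0.tv_sec) * 1e9
                + (t1.tv_nsec - t0.tv_nsec)) / (double)steps;
    }

    /* Sattolo's shuffle: turn the identity permutation into one big cycle,
     * so the chase visits every slot in a hard-to-prefetch order. */
    static void build_cycle(size_t *next, size_t n) {
        for (size_t i = 0; i < n; i++) next[i] = i;
        for (size_t i = n - 1; i > 0; i--) {
            size_t j = (size_t)rand() % i;
            size_t tmp = next[i]; next[i] = next[j]; next[j] = tmp;
        }
    }

    int main(void) {
        /* assumed sizes: ~32 KiB fits in L1, 256 MiB is far past the LLC */
        size_t small_n = (32UL << 10) / sizeof(size_t);
        size_t big_n   = (256UL << 20) / sizeof(size_t);
        size_t steps   = 20 * 1000 * 1000;

        size_t *small = malloc(small_n * sizeof *small);
        size_t *big   = malloc(big_n * sizeof *big);
        if (!small || !big) return 1;

        build_cycle(small, small_n);
        build_cycle(big, big_n);

        printf("cache-resident chase: %.1f ns/load\n", ns_per_load(small, steps));
        printf("DRAM-bound chase:     %.1f ns/load\n", ns_per_load(big, steps));

        free(small);
        free(big);
        return 0;
    }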




> the block size being 16 bytes for dual-channel DDR, 32 bytes for quad channel.

For most practical purposes, I believe the block size to consider on x86 computers should be at least a cache line, i.e. 64 bytes.
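A quick sketch of why the 64-byte line is the unit that matters in practice (my toy benchmark, with assumed sizes, not anything from the article): reading one byte out of every 64-byte line costs about as much as reading all 64 bytes, because the whole line is fetched either way.

    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    #define BUF_BYTES (256UL << 20)   /* assumed: well past the last-level cache */

    /* Sum one byte every 'stride' bytes and report the cost per access. */
    static double ns_per_access(const unsigned char *buf, size_t stride) {
        struct timespec t0, t1;
        unsigned long sum = 0;
        clock_gettime(CLOCK_MONOTONIC, &t0);
        for (size_t i = 0; i < BUF_BYTES; i += stride)
            sum += buf[i];
        clock_gettime(CLOCK_MONOTONIC, &t1);
        volatile unsigned long sink = sum;  /* defeat dead-code elimination */
        (void)sink;
        double ns = (t1.tv_sec - t0.tv_sec) * 1e9 + (t1.tv_nsec - t0.tv_nsec);
        return ns / (double)(BUF_BYTES / stride);
    }

    int main(void) {
        unsigned char *buf = malloc(BUF_BYTES);
        if (!buf) return 1;
        for (size_t i = 0; i < BUF_BYTES; i++) buf[i] = (unsigned char)i;

        /* Up to roughly a 64-byte stride the cost per access keeps rising,
         * because every access still drags in a whole cache line. */
        for (size_t stride = 1; stride <= 256; stride *= 2)
            printf("stride %4zu: %.2f ns per access\n",
                   stride, ns_per_access(buf, stride));

        free(buf);
        return 0;
    }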


You need a clean abstraction when describing elementary algorithms at an undergraduate level. A lot of work has been done on cache-oblivious algorithms that respect the memory hierarchy, including from the authors of the classic CLRS book.
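For anyone who hasn't seen the trick, here is a minimal sketch of the cache-oblivious idea, a recursive matrix transpose in the spirit of Frigo, Leiserson, Prokop and Ramachandran (my toy code, not taken from CLRS): keep splitting the longer dimension, so at some recursion depth the sub-blocks fit in whatever caches exist, without the code ever naming a cache size. The matrix size and cutoff below are arbitrary.

    #include <stdio.h>

    #define N 8                      /* small so the result can be printed */
    #define CUTOFF 2                 /* base-case block size; tune as you like */

    /* Transpose the rows r0..r1 x cols c0..c1 block of src into dst,
     * where both are n x n matrices stored row-major. */
    static void transpose(const double *src, double *dst, int n,
                          int r0, int r1, int c0, int c1) {
        int dr = r1 - r0, dc = c1 - c0;
        if (dr <= CUTOFF && dc <= CUTOFF) {
            for (int r = r0; r < r1; r++)
                for (int c = c0; c < c1; c++)
                    dst[c * n + r] = src[r * n + c];
        } else if (dr >= dc) {       /* split the longer dimension */
            int rm = r0 + dr / 2;
            transpose(src, dst, n, r0, rm, c0, c1);
            transpose(src, dst, n, rm, r1, c0, c1);
        } else {
            int cm = c0 + dc / 2;
            transpose(src, dst, n, r0, r1, c0, cm);
            transpose(src, dst, n, r0, r1, cm, c1);
        }
    }

    int main(void) {
        double a[N * N], b[N * N];
        for (int i = 0; i < N * N; i++) a[i] = i;
        transpose(a, b, N, 0, N, 0, N);
        for (int r = 0; r < N; r++) {
            for (int c = 0; c < N; c++) printf("%4.0f", b[r * N + c]);
            printf("\n");
        }
        return 0;
    }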


Remember the "3M" computer ideal? Megahertz, megabyte, megapixel, for speed, memory, and screen, to define a new class of workstation?

https://en.wikipedia.org/wiki/3M_computer

I now think the next target ought to be defined by latency and power, in particular with IoT requirements in mind.

Microseconds, Milliwatts, Millions of endpoints.

or, for storage: Microseconds, Millions of IOPS, Multi-Parity

or, for compute efficiency and IoT goals:

Milliwatts as a constraint for a compute benchmark.

Millions of endpoints, processes, or cores on a network topology.

Microseconds as a generic measure of access to resources: whether the memory sits on another node or in local store, the delay ought to stay within the same order of magnitude as a target.

My purpose in all of that is to ask: should there be, and in fact can there be, consideration of architectures and topologies that avoid this? Or rather, can we aim for linear cost in addressing complexity at all, as a goal, or has that been lost already?

I mean "that" and "lost" very vaguely, being a non-expert, but my question really is: should I assume there are no effective gains to be had, when designing for an IoT-style or massively networked future, from the way memory is addressed? Or has the complexity we have been introduced out of necessity, so that it is here to stay for practical reasons, and the idea of low-latency, low-power, ad hoc IoT "grid computing" is just smoke in my pipe?


> It’s closer to a block device now, the block size being 16 bytes for dual-channel DDR, 32 bytes for quad channel.

Don't forget the 8-deep prefetch buffer. The real block size is more like 128 bytes for dual channel.
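(If I have the arithmetic right, that follows from the 8n prefetch of DDR3/DDR4 multiplied by the 16-byte dual-channel transfer mentioned above: 8 x 16 = 128 bytes per burst.)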



