Systems programming in any language would benefit immensely from better hardware...

gchadwick · on June 13, 2020

> Systems programming in any language would benefit immensely from better hardware accelerated bounds checking.

[Mostly discussed deeper in the thread already but felt it was worth bringing up directly as a reply to this]

This is an active area of development, though there's still plenty of work to do. The Cambridge University Computer Lab have been doing research in this area in the form of CHERI: https://www.cl.cam.ac.uk/research/security/ctsrd/cheri/. This gives you hardware capabilities, which are effectively pointers with bounds, given out by the OS. Want to access something? You need the appropriate capability to do so.
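To make "pointers with bounds" a bit more concrete, here is a toy software model in Rust. It is only a sketch: the `Capability` struct, its fields, the `Fault` enum and the method names are all made up for illustration, and none of this reflects the actual CHERI capability encoding or instruction set. It only shows the general idea that every access is checked against the bounds and permissions carried by the pointer itself, and that a derived capability can only be narrower than its parent.

```rust
/// Toy model of a bounded, permissioned pointer into a byte array.
/// Hypothetical layout for illustration; NOT the real CHERI format.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Capability {
    base: usize,     // first byte of `memory` this capability covers
    len: usize,      // number of bytes the holder may touch
    can_write: bool, // whether stores are permitted
}

#[derive(Debug, PartialEq)]
enum Fault {
    OutOfBounds,
    PermissionDenied,
}

impl Capability {
    /// Derive a narrower capability. Bounds and permissions can shrink
    /// but never grow relative to the parent.
    fn restrict(&self, offset: usize, len: usize, can_write: bool) -> Result<Capability, Fault> {
        let end = offset.checked_add(len).ok_or(Fault::OutOfBounds)?;
        if end > self.len {
            return Err(Fault::OutOfBounds);
        }
        if can_write && !self.can_write {
            return Err(Fault::PermissionDenied);
        }
        Ok(Capability { base: self.base + offset, len, can_write })
    }

    /// Load one byte, checked against this capability's bounds.
    fn load(&self, memory: &[u8], offset: usize) -> Result<u8, Fault> {
        if offset >= self.len {
            return Err(Fault::OutOfBounds);
        }
        memory.get(self.base + offset).copied().ok_or(Fault::OutOfBounds)
    }

    /// Store one byte, checked against permissions and bounds.
    fn store(&self, memory: &mut [u8], offset: usize, value: u8) -> Result<(), Fault> {
        if !self.can_write {
            return Err(Fault::PermissionDenied);
        }
        if offset >= self.len {
            return Err(Fault::OutOfBounds);
        }
        let slot = memory.get_mut(self.base + offset).ok_or(Fault::OutOfBounds)?;
        *slot = value;
        Ok(())
    }
}
```

In this model, code that is handed only a read-only 4-byte capability cannot write through it or read past its end, no matter what offset arithmetic it does.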

ARM announced MTE last year, which, whilst far less capable than CHERI, begins to give you something (though it's effectively targeted at bug hunting rather than at providing any actual security/safety properties): https://community.arm.com/developer/ip-products/processors/b...

ARM are also building a hardware CHERI implementation, Morello: https://developer.arm.com/architectures/cpu-architecture/a-p.... There are already CHERI implementations running BSD on FPGAs; this will take it to the next level with real silicon using modern ARM processors (maybe you could run Android on it, for instance?).

I think Intel have been looking along similar lines but I'm far less in touch with what they're up to so I don't have links.

zozbot234 · on June 13, 2020

> Systems programming in any language would benefit immensely from better hardware accelerated bounds checking. ... The world would be a lot safer if we had hardware features like cacheline faults, poison words, bounded MOV instructions, hardware ASLR, and auto-encrypted cachelines.

This isn't really true. The main performance cost of these safety checks (and largely of others such as overflow checking) is inability to optimize because the compiler needs to preserve partial results/states in case a fault occurs. The checks themselves are trivial.
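A small Rust sketch of what "preserving partial results" means in practice. The function names `fill_checked` and `fill_hoisted` are hypothetical, chosen only for this illustration. In the first version every iteration carries its own check, so if the fault fires partway through, all earlier writes must already be visible. The second version checks once up front, which leaves a simple loop behind, but it faults before any write happens. Because the two are observably different, a compiler is not free to turn the first into the second on its own.

```rust
/// Per-element check: on an out-of-range `n`, the panic fires only after
/// every in-range element has already been written.
fn fill_checked(dst: &mut [u32], n: usize, value: u32) {
    for i in 0..n {
        dst[i] = value; // bounds check (and possible panic) on each iteration
    }
}

/// Single check hoisted out of the loop by hand: on an out-of-range `n`,
/// the slice operation panics before anything is written.
fn fill_hoisted(dst: &mut [u32], n: usize, value: u32) {
    let head = &mut dst[..n]; // the only bounds check
    for slot in head {
        *slot = value;
    }
}
```

Calling both with `n` larger than the slice length ends in a panic either way, but `fill_checked` leaves the whole slice overwritten while `fill_hoisted` leaves it untouched.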

CoolGuySteve · on June 13, 2020

Partial results like, say, a hardware exception firing instead of forcing the compiler to reason about it?

I want efence on steroids.

arcticbull · on June 13, 2020

To the maximum extent possible, bounds checking should be elided via greater compiler knowledge of what exactly is happening. This would leave arbitrary bounds checks limited to user input, in which case the vast majority of the performance penalty goes away, right? "Bounds" are a higher-order programming language concept that I suggest may not have a place in hardware.

steveklabnik · on June 13, 2020

Compilers already do this. It would still be nice to have cheaper checks; compilers are not omniscient.

arcticbull · on June 13, 2020

Yeah, for sure.

I think I was having trouble envisioning what exactly a "cheaper" check could look like in hardware. Bounds checking is basically: read the length, subtract your index, and take a conditional branch (potentially with a hint that it would succeed).
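Spelled out by hand in Rust, that sequence looks roughly like the sketch below. The function name `checked_load` is hypothetical. This is essentially what an ordinary indexing expression or `slice::get` already does for you, written out so the three steps are visible.

```rust
/// Manual bounds-checked load: read the length, compare it with the
/// index, and branch on the result.
fn checked_load(data: &[u32], index: usize) -> Option<u32> {
    let len = data.len(); // 1. read the length
    if index < len {
        // 2. compare (typically a subtract that only sets flags)
        // 3. conditional branch, almost always taken on the success path
        // SAFETY: `index < len` was verified just above.
        Some(unsafe { *data.get_unchecked(index) })
    } else {
        None
    }
}
```

For example, `checked_load(&[10, 20, 30], 1)` returns `Some(20)`, while index 3 returns `None`.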

To do this properly in hardware I suppose you'd need a list of memory regions that are "live" and default the rest to "dead", though how many do you support? What does updating the list look like? Page tables are pretty slow to update, and those don't change too often. Array tables would be pretty gnarly, and impose a further penalty on context switching as they'd have to be thread-local and app-local.

I wonder if this is a case akin to spinlocks. Sure, I'd love a lock that doesn't busy-wait, but there's not really a cleaner solution -- in hardware or otherwise.

Maybe I'm just not seeing something obvious, though!

saagarjha · on June 13, 2020

You might find an almost-practical (not shipping yet, but should very soon) example helpful: https://community.arm.com/developer/ip-products/processors/b...

arcticbull · on June 13, 2020

That's brilliant, and I retract my statement completely. Thanks for sharing!

jabl · on June 13, 2020

In addition to ARM MTE, see also Cheri: https://www.cl.cam.ac.uk/research/security/ctsrd/cheri/

monadic2 · on June 13, 2020

> The world would be a lot safer if we had hardware features like cacheline faults, poison words, bounded MOV instructions, hardware ASLR, and auto-encrypted cachelines.

Sure, the world would be a lot safer if we used microkernels, too, but the tech world has been obsessed with performance over all other characteristics for decades.

todd8 · on June 14, 2020

My first full-time professional job (1976) involved writing assembly language for the TI 960, a process control minicomputer with a 64K data address space and a separate 64K instruction space.

At first, I just thought it was odd to make the hardware more complex by having two address spaces. However, this prevented a common cause of difficult-to-find bugs in asm programming, and I came to appreciate that hardware could make programming safer.

Hardware architecture, programming languages, and compilers advanced rapidly, but safety always seemed to take a backseat and was left up to the programmers. I’m glad to see developments like Rust and I look forward to using it for a real project soon.

pjmlp · on June 14, 2020

You have it on CHERI, ARM and SPARC; it was Intel that screwed up.

So make use of Solaris SPARC, iOS or Android 11 (ARM MTE is a requirement).

jcranmer · on June 13, 2020

You mean that Intel should invest in something like MPX?

my123 · on June 13, 2020

MPX is dead, and was so expensive that 100% software bounds checking was cheaper...

jcranmer · on June 13, 2020

That was my point. :-) It's not clear that hardware bounds checking acceleration is actually a meaningful win.

pjmlp · on June 14, 2020

Sure it is; Intel screwed up.

Solaris SPARC, iOS and now Android 11 all make use of some kind of hardware validation of memory accesses.

my123 · on June 14, 2020

MTE today, Morello as the experiment for fully safe C tomorrow: https://developer.arm.com/architectures/cpu-architecture/a-p...

pjmlp · on June 14, 2020

Any help is welcome. Microsoft is also having a go with Checked C.

The main problem is forcing developers to actually use them. I guess that is why Google has decided to make MTE a requirement for Android 11 on ARM devices.