This is a really good distinction to make - people use the halting problem as an...

nostrademons · on July 8, 2021

Yes, but deciding whether any program can halt is exactly what's called for here. You're operating on x86 machine code, which is Turing-complete, and you don't have the option of restricting to a non-Turing-complete subset of x86 machine code without significantly shrinking your addressable market. (There's already been significant research on creating provably-secure or provably-terminating instruction sets, with some success in specific problem domains, but they tend to fail with general purpose personal computing because consumers won't use your product if it means SimCity won't work.)

Sniffnoy · on July 8, 2021

Your argument can be seen to be invalid because it attempts to apply to any Turing-complete architecture, when in fact the problem is more architecture-specific than that.

In an architecture that used fixed-length instructions and required the program counter to be aligned on them -- or had some other mechanism that disallowed overlapping instructions -- it would be easy to scan for an opcode like this. (Assuming you can distinguish the instructions from the data segment, anyway.) Such a solution doesn't run into any problems with the halting problem or Rice's theorem because it's not attempting to tell you with certainty whether the program will hit that opcode; it just gives you a one-sided test that says, aha, it could hit that opcode (because it's present) and therefore the program is suspicious. A negative result guarantees it won't hit the opcode, a positive result doesn't guarantee anything.

It's only x86's allowing of overlapping instructions that makes this problem hard. I mean, notionally you could scan for such an opcode at every offset, it's just that the false positive rate might be unacceptably high. (Whereas in the fixed-length instruction case, it wouldn't be, because a program using one of these opcodes at all truly is suspicious, regardless of whether it's hit. Although, again, I'm assuming you can somehow filter out the data segment; if you can't, then we have more of a problem.)

Sniffnoy · on July 8, 2021

Oh, wait -- there's a hole in my claim above. Self-modifying code. Well, it works if we also disallow self-modifying code; this still leaves things as Turing-complete and so still shows that your argument cannot be valid.

nostrademons · on July 9, 2021

It's possible to prove that a given x86 binary does not have malicious opcodes. Trivial examples would be a zero-length program, or one containing only NOP instructions. You just scan the program and if the byte sequence representing malicious opcodes never appears, you know that you're in the clear.

That's not the problem being posed here, though. To be useful as a security feature, it needs to be able to work on arbitrary x86 programs, i.e. ones that can contain potentially malicious opcode sequences. And many of these programs will not in fact be malicious, i.e. they never jump into the middle of some other legit opcode to have it interpreted differently. A one-sided test that tells you it could hit that opcode isn't useful if it flags every useful program on the computer.

Sniffnoy · on July 9, 2021

Yes, I don't disagree. My point is that your argument above for that claim is invalid, as your argument is based on purely the Turing-complete nature of x86, while other Turing-complete architectures do not have this problem.