The 43-byte implementation might define only a subset of the functionality provided by the full VM, enough to "bootstrap" into the full implementation, most likely.
In fact, if the VM is Turing complete, it can theoretically emulate any computation, including its full implementation, even from a small subset of operations.
The point is that the 43-byte implementation does not need to encode the entire VM explicitly. For example, if the VM has built-in primitives for looping, branching, and memory management, the minimal implementation can leverage these to rebuild the remaining functionality.
"For example, its metacircular evaluator is 232 bits. If we use the 8-bit version of the interpreter (the capital Blc one) which uses a true binary wire format, then we can get a sense of just how small the programs targeting this virtual machine can be."
In fact, if the VM is Turing complete, it can theoretically emulate any computation, including its full implementation, even from a small subset of operations.
The point is that the 43-byte implementation does not need to encode the entire VM explicitly. For example, if the VM has built-in primitives for looping, branching, and memory management, the minimal implementation can leverage these to rebuild the remaining functionality.