I think this is the big downside. You're effectively taking information which will always be needed at the same time, and storing it in two different places.
There is never a need for one piece of information without the other, so why not store it together?
Interesting idea. You're effectively moving the extra decode stage in front of the Icache, which makes the Icache a bit like a CISC trace/micro-op cache. On a 512-bit line you would add 32 bits to mark the instruction boundaries, one per 16-bit parcel. At which point you start to wonder whether there is anything else worth adding that would simplify the later decode chain, and whether the roughly 5% increase in Icache size (the raw figure is 1/16th, but less in practice since a lot of the overhead is shared) is worth it.
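To make the 32-bits-per-line figure concrete, here is a rough sketch of what the predecode step at Icache fill could compute. It assumes a RISC-V-style encoding where a 16-bit parcel whose low two bits are 0b11 starts a 32-bit instruction and anything else is a 16-bit compressed instruction; it ignores longer encodings and assumes the first parcel of the line starts an instruction. The names (boundary_mask, LINE_BYTES, and so on) are made up for illustration, not taken from any real design.

```python
LINE_BYTES = 64                            # 512-bit cache line
PARCEL_BYTES = 2                           # 16-bit parcel, smallest instruction unit
N_PARCELS = LINE_BYTES // PARCEL_BYTES     # 32 parcels, so 32 boundary bits

# Raw storage overhead of the boundary bits relative to the data array.
RAW_OVERHEAD = N_PARCELS / (LINE_BYTES * 8)    # 32 / 512 = 1/16


def boundary_mask(line: bytes) -> int:
    """Return a 32-bit mask; bit i is set if parcel i starts an instruction.

    Assumption: RISC-V-style length encoding (low two bits == 0b11 means a
    32-bit instruction, otherwise 16-bit), little-endian parcels, and the
    first parcel of the line is an instruction start.
    """
    if len(line) != LINE_BYTES:
        raise ValueError("expected a full cache line")
    mask = 0
    i = 0
    while i < N_PARCELS:
        mask |= 1 << i                     # parcel i begins an instruction
        low_byte = line[i * PARCEL_BYTES]  # little-endian: low byte comes first
        # A 32-bit instruction occupies two parcels, a compressed one a single parcel.
        i += 2 if (low_byte & 0b11) == 0b11 else 1
    return mask


if __name__ == "__main__":
    all_compressed = bytes([0x01, 0x00]) * 32        # 32 x c.nop
    all_full = bytes([0x13, 0x00, 0x00, 0x00]) * 16  # 16 x nop
    print(f"{boundary_mask(all_compressed):#010x}")
    print(f"{boundary_mask(all_full):#010x}")
    print(RAW_OVERHEAD)
```

The sequential walk is the part the later decoders would no longer have to do: with the mask stored next to the line, each decode slot can find its start position from the bits instead of waiting on the length of the previous instruction.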
Which Unicode encoding are you talking about? It sounds a bit like you're talking about UTF-16 surrogate pairs, but that's not how those work. It's not how UTF-8 or UTF-32 work either. So, which encoding is this?
If I understand you correctly, the guy I'm responding to is proposing allowing different-sized instructions to be mixed freely. Your suggestion effectively says "I'm starting a run of compressed instructions/I'm finishing a run of compressed instructions", which is a different proposition. Just my take though.