Talk to Cliff Click (or sift through his blog at http://www.azulsystems.com/blogs/cliff) and he will quickly disabuse you of the notion that having bytecode instructions in a processor is a good idea. For optimal performance you want a generic RISC-type processor with good hardware support for critical things like the read barriers that the bytecode might require; you really don't want to design a processor that directly executes Java bytecode.
There's a whole hell of a lot of optimization that gets done in the JIT layer besides just translating bytecode to machine code, and you want all that stuff to get done in software rather than trying to build it into your processor. Once you've done all that stuff, emitting actual assembly code isn't really the hard part. So you might as well design a processor that you can make fast, give it a simple general-purpose instruction set, and target that instruction set in your JIT. Then add a few special goodies as you need them, for things that are really, really hard to do fast without specialized hardware support.
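To make that concrete, here is a toy sketch of the idea, not how any real JIT works. The four-opcode stack "bytecode" (`push`, `load`, `add`, `mul`), the register-style output format, and the `ToyJit` class are all made up for illustration. The translator folds constants while it converts stack code to register code, so the work a bytecode-executing processor would have to do at run time simply disappears before any machine instruction is emitted.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Hypothetical toy translator: stack bytecode in, register-style code out.
final class ToyJit {

    // Translates the made-up stack bytecode into three-operand instructions,
    // folding any operation whose operands are both compile-time constants.
    static List<String> compile(List<String> bytecode) {
        Deque<String> stack = new ArrayDeque<>(); // holds literals, variable names, or register names
        List<String> out = new ArrayList<>();
        int nextReg = 0;
        for (String insn : bytecode) {
            String[] parts = insn.split(" ");
            switch (parts[0]) {
                case "push": // integer literal
                case "load": // named variable
                    stack.push(parts[1]);
                    break;
                case "add":
                case "mul": {
                    String right = stack.pop();
                    String left = stack.pop();
                    if (isConst(left) && isConst(right)) {
                        // Both operands known now: compute the result here, emit nothing.
                        int l = Integer.parseInt(left);
                        int r = Integer.parseInt(right);
                        stack.push(Integer.toString(parts[0].equals("add") ? l + r : l * r));
                    } else {
                        // At least one operand is only known at run time: emit one instruction.
                        String dst = "r" + nextReg++;
                        out.add(parts[0] + " " + dst + ", " + left + ", " + right);
                        stack.push(dst);
                    }
                    break;
                }
                default:
                    throw new IllegalArgumentException("unknown opcode: " + insn);
            }
        }
        return out;
    }

    static boolean isConst(String operand) {
        return operand.matches("-?\\d+");
    }

    public static void main(String[] args) {
        // x * (2 + 3) as stack code: five bytecodes in, one instruction out.
        List<String> code = Arrays.asList("load x", "push 2", "push 3", "add", "mul");
        System.out.println(compile(code)); // prints [mul r0, x, 5]
    }
}
```

A processor that executed the five stack operations directly would perform the `2 + 3` every time through; the software translator does it once and hands a plain register machine a single multiply. Real JITs do far more than this (inlining, for instance), but the shape of the argument is the same.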