I was thinking of the T4 and I had my dates wrong. The T4 does weird dynamic threading internally, so it can be out-of-order and in-order at the same time, depending on the number of execution units you need.
Furthermore, you illustrate that Intel was again ahead, having a 6-core scalar galaxy hiding behind a virtual processor. You can just change that to a 3 and, lo and behold, 50% less power consumption!