> And today I think we at least need to go to the level of that sub-assembly stuff if one wants to write anything that does not perform badly
You can write fast software today in assembly, C, C++, Rust, and similar languages. You don't need to go to the microcode level. The reason we have trouble with slow, bloated software today is that people don't do that. Instead they use slower languages like Python or JavaScript, pick inefficient algorithms, and pull in lots of potentially unnecessary dependencies that are built for the general case rather than for whatever specific problem the programmer wants to solve, which often makes them less efficient as well.
By sub-assembly level I meant that, regardless of your language choice, you need at least some knowledge of pipelining, branch prediction, caching, etc. In other words, the hardware is not a completely clean abstraction that you can totally ignore and forget about. The layers underneath that are probably not so important.
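To make the caching point concrete, here is a minimal sketch in C. The function names and the flat `n` x `n` layout are my own assumptions for illustration, not anything from the comments above. Both functions compute the same sum. The only difference is the order in which they touch memory.

```c
#include <stddef.h>

/* Hypothetical helpers for illustration: `m` is an n x n matrix stored
 * row by row in one flat array, so element (i, j) lives at m[i * n + j]. */

/* Visits elements in the order they are laid out in memory, so
 * consecutive accesses usually fall in the same cache line. */
long sum_row_major(const int *m, size_t n) {
    long total = 0;
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++)
            total += m[i * n + j];
    return total;
}

/* Same arithmetic, but each access is n * sizeof(int) bytes away from
 * the previous one, so for large n most accesses touch a new cache line. */
long sum_col_major(const int *m, size_t n) {
    long total = 0;
    for (size_t j = 0; j < n; j++)
        for (size_t i = 0; i < n; i++)
            total += m[i * n + j];
    return total;
}
```

On typical hardware I would expect the column-order version to be noticeably slower once the matrix no longer fits in cache. How much slower depends on the machine, and an optimizing compiler may interchange the loops for you, so measure before drawing conclusions. Nothing at the language level tells you the two differ, which is the sense in which the abstraction leaks.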