Depends on application type and device.
I wouldn't hold my breath. E.g. in games - one of the most performance-critical kinds of application - the best idea so far is to spin up one thread for physics, one for AI, one for game logic, etc.
Here is a good blog post describing how sad the state of things is: http://bitsquid.blogspot.co.uk/2015/03/multithreaded-gamepla...
IMHO, part of that is that you don't want more threads than processor cores (unless they're IO-bound). If we had lots more cores, then I would expect physics, AI, game logic etc. to each use more than one thread. There is little reason to pay for the additional complexity when most mainstream computers don't have more than maybe 4-8 cores. Obviously not everything parallelises well, so how far this can be pushed remains to be seen.
I would rather give each type of computation a chance to use all cores: parallelism rather than concurrency. But there is a lot of legacy code out there.
I agree: one thread pool for all tasks (I'm personally a fan of Intel's Threading Building Blocks, so...)
However, if you only target, say, 4 cores and you have 4 subsystems (physics, AI, rendering and gameplay, let's say), and all four constantly use the CPU (i.e. you don't have much downtime), then the extra complexity and overhead (moving data between cores, shared data, locking) may not be worth it over just pinning each subsystem to its own core.