Can you explain at a high level the type of work the second app you mentioned was doing? I know that multithreading can actually cause slowdowns if the task given to each thread isn't substantial enough to offset the cost of context switching, or if the tasks are dependent/interconnected. Are there any other criteria to take into account when considering whether or not to make an app multithreaded?
Pretty much that. It's not just the context switching; it can also be tricky to come up with a synchronization strategy that doesn't confound the CPU scheduler's efforts to keep the pipeline full.
It also depends on your environment. That second app I mentioned was running on physical hardware that was also running many other applications. In that kind of environment, you can end up in a sort of "double your CPU cores, double your cache misses" situation. And the performance story ends up not just being about one little module; it's about the entire system. There can be a sort of performance prisoner's dilemma, where trying to maximize the performance of every single piece in isolation actually results in slower overall performance.
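To put a rough shape on the "task isn't substantial enough" point from the question, here's a back-of-the-envelope sketch in Python. It's a toy model, not a measurement: the function name `estimated_speedup`, the 20-microsecond overhead, and the 4 workers are all made-up illustrative values, and it assumes independent tasks spread evenly across threads. It deliberately ignores the cache-miss and whole-system effects described above, which only make the real picture worse.

```python
def estimated_speedup(work_us: float, overhead_us: float, workers: int) -> float:
    """Toy estimate of the speedup from handing each task to a worker thread.

    Assumptions (all hypothetical, for illustration only):
      - every task takes work_us microseconds of real work
      - threading adds a fixed overhead_us per task
        (context switching, synchronization, queueing)
      - tasks are independent and divide evenly across workers
    A result below 1.0 means the threaded version is slower.
    """
    sequential_cost = work_us                           # per task, single thread
    threaded_cost = (work_us + overhead_us) / workers   # per task, amortized
    return sequential_cost / threaded_cost


if __name__ == "__main__":
    # Sweep the task size while holding the overhead and worker count fixed.
    for work_us in (1, 10, 100, 1000):
        speedup = estimated_speedup(work_us, overhead_us=20, workers=4)
        print(f"{work_us:>5} us of work per task -> {speedup:.2f}x")
```

In this model, threading only pays off once the work per task exceeds `overhead_us / (workers - 1)`. Below that, adding threads makes things slower even before dependencies or shared caches enter the picture.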