How is this done in detail? In the case of Flask, does it create new processes or threads? How quickly does it create a new Flask process/thread to serve a new request?
EDIT: This page documents uWSGI's autoscaling ("cheaper") configuration settings: http://uwsgi-docs.readthedocs.org/en/latest/Cheaper.html - it looks like scaling is driven by various heuristics and algorithms; in practice, spawning a new process per request would be too slow, so some resources inevitably go to waste.
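For concreteness, here's a minimal sketch of what a cheaper setup might look like, using option names from the page linked above. The numbers are purely illustrative, not recommendations:

```ini
[uwsgi]
workers = 8            ; hard ceiling on worker processes
cheaper = 2            ; minimum idle workers kept alive
cheaper-initial = 4    ; workers started at boot
cheaper-step = 2       ; how many workers to spawn per scale-up
cheaper-algo = busyness
cheaper-overload = 1   ; evaluate busyness every second
```

So rather than one process per request, the master keeps a small pool warm and grows it in steps when the existing workers look busy.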
I would think that ideally the number of processes/threads would exactly match the number of live requests. Otherwise you either have too few instances to handle all the requests as fast as possible, or too many instances, which means more resources are used than necessary, increasing costs.
I think Node.js is an improvement because the architecture is better suited for scaling and all the dependencies are asynchronous by default. Of course, I think there are other reasons for Node.js (e.g. isomorphism, easier full-stack development etc.), but I guess that's another discussion, really.
In terms of the ideal number of processes/threads, I'm not so sure the number of live requests is correct. It's going to depend on your various resource constraints, but if requests are quick to service, there's no problem with having other requests waiting in a queue.
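The "quick to service" point can be made concrete with Little's law: average concurrency equals arrival rate times service time, so a fast handler keeps very few workers busy even under decent load. A back-of-envelope sketch with hypothetical numbers (not from this thread):

```python
# Little's law: mean number of requests in service L = lambda * W,
# where lambda is the arrival rate and W is the mean service time.
arrival_rate = 100.0   # requests per second (hypothetical)
service_time = 0.02    # 20 ms per request (hypothetical)

mean_in_service = arrival_rate * service_time
print(mean_in_service)  # -> 2.0 workers busy on average

# With a modest fixed pool, per-worker utilisation stays low and
# any queueing delay is on the order of one service time.
pool_size = 4
utilisation = mean_in_service / pool_size
print(utilisation)      # -> 0.5
```

In other words, 100 req/s at 20 ms each keeps only ~2 workers busy on average, so a pool of 4 has plenty of headroom and matching pool size to live requests one-for-one buys you very little.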
I once was called in to fight a fire where the team said, "we don't understand, we have horizontal scalability, but adding more machines seems to be making things run slower!" - Erm, yeah, because your single db server is on its knees :-)
Edit to add that I've just looked through the cheaper docs you mentioned and that behaviour all looks pretty sane. You can have a good default number of workers and expand as required. Also, uWSGI forks workers after the app is loaded in memory so you get fast copy-on-write behaviour during worker start.
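That preload-then-fork pattern can be sketched in a few lines of Python. This is not uWSGI's actual code, just an illustration of the mechanism: the master loads the app once, then forks, and the children share its memory pages copy-on-write until they write to them. `app_state` and `serve` are hypothetical stand-ins for a loaded WSGI app and its request loop:

```python
import os

# Loaded ONCE in the master before forking; workers inherit these
# pages and share them copy-on-write, so startup is nearly free.
app_state = {"config": "loaded once in the master"}

def serve(worker_id):
    # Reads hit the shared pages; only a write would trigger a copy.
    return f"worker {worker_id} sees {app_state['config']}"

pids = []
for i in range(2):
    pid = os.fork()
    if pid == 0:
        # Child: in a real server this would enter the request loop.
        serve(i)
        os._exit(0)
    pids.append(pid)

for pid in pids:
    os.waitpid(pid, 0)  # master reaps its workers
```

The expensive part (importing the app and its dependencies) happens once; each extra worker is just a fork, which is why expanding the pool on demand is fast.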
Of course, this doesn't make your app and its dependencies work in an async-friendly way, but then again, Node.js callbacks are hardly transparent either.