Here's my question, being relatively ignorant of Docker's internals: _is it possible_ to improve that docker create/docker start time from 300-400 ms (all in) to <100ms? 300-400 ms is still a lot of latency for a cold boot, and people resort to things like keepalive pings to keep functions warm, so it would be pretty great to bring that down some more.
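For anyone who wants to reproduce numbers like that, here's a rough harness (a sketch, not a proper benchmark: it assumes a Linux box with GNU date and a local Docker daemon with `alpine` already pulled, i.e. disk-warm, so you're isolating create+start from registry fetch):

```shell
#!/bin/sh
# Wall-clock a command in milliseconds (GNU date's %N gives nanoseconds).
ms() {
  s=$(date +%s%N)
  "$@" >/dev/null 2>&1
  e=$(date +%s%N)
  echo $(( (e - s) / 1000000 ))
}

# Only time docker if a daemon is actually reachable.
if command -v docker >/dev/null 2>&1 && docker info >/dev/null 2>&1; then
  # Image is already pulled, so this measures create + start + teardown,
  # not pull. Run it a few times and take the median.
  echo "docker run --rm alpine true: $(ms docker run --rm alpine true) ms"
fi
```

Single runs are noisy, so treat any one number as indicative only.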
I take the view that there is a "warmth hierarchy" (an idea I spitballed with other folks in Knative before Matt Moore took it and filled it out a lot).
You have a request or message. You need to process it. The hierarchy runs CPU > Memory > Disk > Cold.
CPU-warm: immediately schedulable by the OS. Order of nanoseconds.
Memory-warm: in-memory but can't be immediately scheduled. Our reference case is processes that have been cgroup frozen. Order of tens of milliseconds.
Disk-warm: the necessary layers are all present on the node which is assigned to launch the process. Order of hundreds of milliseconds.
Cold start: nothing is present on the node which is assigned to launch the process. It will need to fetch some or all layers from a registry. Order of seconds to minutes.
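For the memory-warm tier, the cgroup v2 freezer is the mechanism I have in mind: write 1 to a group's cgroup.freeze and its processes keep their pages resident but get no CPU; write 0 and they're schedulable again. A minimal sketch (assumes root and a cgroup v2 mount at /sys/fs/cgroup; the group name "warmpool" is made up for illustration):

```shell
#!/bin/sh
# Freeze and thaw a process with the cgroup v2 freezer.
CG=/sys/fs/cgroup/warmpool

if [ -f /sys/fs/cgroup/cgroup.controllers ] && [ -w /sys/fs/cgroup ]; then
  mkdir -p "$CG"
  sleep 300 &                        # stand-in for a warm function process
  pid=$!
  echo "$pid" > "$CG/cgroup.procs"   # move it into the group
  echo 1 > "$CG/cgroup.freeze"       # frozen: memory stays resident, no CPU
  # ...next request arrives...
  echo 0 > "$CG/cgroup.freeze"       # thawed: runnable again
  kill "$pid"; wait "$pid" 2>/dev/null
  rmdir "$CG"
  FROZE=1
else
  echo "sketch only: needs root and a cgroup v2 mount"
  FROZE=0
fi
```

The thaw itself is cheap; the "tens of milliseconds" above is dominated by getting the process scheduled and its pages touched again, not by the cgroup write.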
Some of these problems can't be solved by Kubernetes (it doesn't know how to freeze a process), and some can't be solved by Docker (it doesn't know about other nodes that might have warmer disk caches). Then there's the business of deciding where to route traffic, how (and on what predictive basis) to prewarm caches, and how to avoid siloisation of processes due to feedback loops.
As these are knocked down they will pay dividends for everyone.
The main optimizations are around pre-allocating network namespaces. We've built some experimental code to skirt around this, https://github.com/fnproject/fn/blob/master/api/agent/driver... (which can be configured & tested), and we're trying approaches like it to get start times down as low as we can. <100ms still seems like a lofty goal at this point, but this is becoming a pain point for container-based FaaS platforms generally, and some optimization work should come out of all this. A few layers up, it's a little easier to speed things up as throughput increases for a given function, but devs do seem to want the 0->1 case to be almost as fast as the 'hot' case, so we FaaS people need to answer this better in general; it isn't something we want users to have to think about!
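You can approximate the pre-allocated-namespace idea with stock Docker (this is my rough analogy to it, not how fn does it): keep a small pool of long-lived pause-style containers around, then launch the real container into one of their already-configured network namespaces with --net=container:..., which should skip the per-launch netns/veth setup. A sketch, where the pool size and the netpool-* names are made up, and a local daemon with alpine pulled is assumed:

```shell
#!/bin/sh
# Sketch: amortize network-namespace setup by reusing the netns of
# pre-started pause-style containers.
if command -v docker >/dev/null 2>&1 && docker info >/dev/null 2>&1; then
  # Warm phase: pay the netns/veth setup cost ahead of time.
  for i in 1 2 3; do
    docker run -d --name "netpool-$i" alpine tail -f /dev/null >/dev/null
  done

  # Hot path: joining an existing namespace skips that setup.
  docker run --rm --network "container:netpool-1" alpine true

  # In real use the pool would be replenished; clean up for the demo.
  docker rm -f netpool-1 netpool-2 netpool-3 >/dev/null
  POOLED=1
else
  echo "sketch only: needs a local Docker daemon"
  POOLED=0
fi
```

The trade-off is that the pooled containers hold IPs and a little memory while idle, which is exactly the kind of warmth-for-resources bargain the hierarchy above describes.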