
Scale Testing Docker Swarm to 30,000 Containers - ah3rz
http://blog.docker.com/2015/11/scale-testing-docker-swarm-30000-containers/
======
richardwhiuk
Surely the API Response Time for the 99th percentile should be larger than
that of the 90th?

    
    
      Percentile  API Response Time  Scheduling Delay
      50th        150ms              230ms
      90th        250ms              400ms
      99th        200ms              400ms
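
To illustrate why a 99th percentile below the 90th looks wrong: percentiles of a single sample set can never decrease as the percentile rises. The sketch below uses the nearest-rank method, which is an assumption (the article does not say how its percentiles were computed), and the latency values are invented for illustration only.

```python
def nearest_rank_percentile(samples, p):
    """Return the p-th percentile (integer p, 1..100) by the nearest-rank method."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 1 <= p <= 100:
        raise ValueError("p must be between 1 and 100")
    ordered = sorted(samples)
    # Integer ceiling of p * n / 100 gives the 1-based rank.
    rank = (p * len(ordered) + 99) // 100
    return ordered[rank - 1]

# Hypothetical latencies in milliseconds, not taken from the article.
latencies_ms = [120, 130, 150, 150, 160, 180, 200, 250, 300, 900]

p50 = nearest_rank_percentile(latencies_ms, 50)
p90 = nearest_rank_percentile(latencies_ms, 90)
p99 = nearest_rank_percentile(latencies_ms, 99)

print(p50, p90, p99)
```

Because a higher percentile always maps to an equal or higher rank in the same sorted list, p99 can equal p90 (as in the Scheduling Delay column) but cannot be smaller.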

~~~
bfirsh
Oops, there was a copy & paste error – sorry about that! Updated with correct
figures.

~~~
meteorfox
Besides that, why are all these response time numbers multiples of 10? Are
they rounded, or is that the precision of the timestamps used to compute
them?
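
One way the numbers could all end up as multiples of 10 is sketched below. It assumes, purely for illustration, that measurements are recorded or reported at a 10 ms resolution; nothing in the article confirms this, and the raw values are invented.

```python
def quantize_ms(value_ms, resolution_ms=10):
    """Round a latency to the nearest multiple of resolution_ms."""
    return int(round(value_ms / resolution_ms)) * resolution_ms

# Hypothetical raw measurements in milliseconds, not taken from the article.
raw_ms = [147.3, 231.8, 252.6, 402.1]

reported_ms = [quantize_ms(v) for v in raw_ms]
print(reported_ms)
```

Under that assumption every reported figure is a multiple of 10 regardless of the underlying values, so the table alone cannot distinguish rounding from coarse timestamps.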

------
nyandaber
The article says the test involved 1000 nodes running 50 containers each, but
the conclusion only talks about "no difference between 1st and 30000th
container". So we can assume things between 30k and 50k containers didn't go
as smoothly? Or did I miss something?

~~~
IanCal
Possibly a copy error? The article says 30 containers on each node in the
specs but 50 in the text.

~~~
shykes
It's a copy error. Initially we tested 30k containers, then later expanded the
test to 50k. In both cases Swarm keeps scheduling without breaking a sweat.

~~~
fpp
I understand that your target with this setup was to load test the Swarm
manager, and 30K is quite impressive.

Did you test, or are you planning to test, with something other than the
low-end T2.micros, to see how much of the API latency might be related to the
type of nodes used?

In other words, if my intention were to minimise the API latency, how would
you approach this?

We are planning to run a Swarm test (with a few hundred nodes) on Digital
Ocean, where we have 2 NICs per machine. That way we can test the latency /
response time of the containers to their external workload, and to the API /
Swarm manager, on separate networks and NICs.

------
justinsaccount
I still don't quite get the use case for Swarm right now.

It seems to solve the problem of starting up a bunch of containers across a
bunch of hosts, but not anything after that. Maybe the pluggable backends will
enable more of the lifecycle management features like restarting containers
when a host crashes or handling rolling upgrades.

~~~
ThePhysicist
I think Swarm transparently handles networking and volume management for
containers, which can be challenging if your services require multiple
containers that run on different hosts and need to talk to each other.

------
wheaties
When they say "API response time" I can't tell if that's some service API that
rests in Docker itself or if it's the Docker API. Can anyone clarify? I also
haven't had my coffee this morning.

~~~
shykes
It's the Swarm API call. In other words it's "container scheduling time".

------
lbradstreet
How can the 99th percentile latency be lower than the 90th? That must be a
mistake or the result of multiple test runs?

~~~
kbody
Scheduling Delay is different from API response time.

~~~
lbradstreet
The article was updated with the correct results before you saw it.

