Are there still issues with agents disconnecting?
Should we not bother and go straight to Kubernetes?
Biggest gotcha: tasks restarting over and over because of a bad load balancer config on my part (for instance, expecting a 200 status code when the health check endpoint returns a 302).
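For anyone hitting the same restart loop, here is a rough sketch of how the expected status codes could be widened on an ALB target group with boto3. The `/health` path, the thresholds, and the helper names are made up for illustration; the target group ARN would come from your own setup. Having the endpoint return a plain 200 instead of redirecting is probably the cleaner fix.

```python
def health_check_settings(path="/health", accepted_codes=("200", "302")):
    """Build health check keyword arguments for an ALB target group.

    The path, thresholds, and accepted codes are hypothetical defaults;
    adjust them to what your endpoint actually returns.
    """
    return {
        "HealthCheckPath": path,
        "HealthCheckIntervalSeconds": 30,
        "HealthyThresholdCount": 2,
        "UnhealthyThresholdCount": 3,
        # Matcher lists the HTTP codes the load balancer treats as healthy.
        "Matcher": {"HttpCode": ",".join(accepted_codes)},
    }


def apply_health_check(target_group_arn, settings):
    """Send the settings to AWS. Needs boto3 and valid credentials."""
    import boto3  # imported lazily so the builder above works without it

    elbv2 = boto3.client("elbv2")
    return elbv2.modify_target_group(TargetGroupArn=target_group_arn, **settings)


settings = health_check_settings()
print(settings["Matcher"])
```

Only `apply_health_check` talks to AWS, so the settings can be inspected or unit tested first.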
Some of what won me over:
* IAM role integration at both instance and task level
* ecs-cli can use docker-compose.yml (with minor revision)
* easy use of spot fleets
* cron support for tasks
* easy to script control of clusters from your app with the AWS SDK
I evaluated Kubernetes, and may give it another look soon, but ECS was pretty easy to get going.
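As an illustration of scripting cluster control through the AWS SDK, here is a minimal sketch that scales a service with boto3's `update_service`. The cluster and service names, the `max_count` cap, and the helper names are assumptions for the example, not anything from my actual setup.

```python
def scale_request(cluster, service, desired_count, max_count=10):
    """Build arguments for ecs.update_service, capping the task count.

    max_count is a hypothetical safety limit, not an ECS setting.
    """
    if desired_count < 0:
        raise ValueError("desired_count must not be negative")
    return {
        "cluster": cluster,
        "service": service,
        "desiredCount": min(desired_count, max_count),
    }


def scale_service(request):
    """Apply the request. Needs boto3 and valid credentials."""
    import boto3  # imported lazily so the builder above works without it

    ecs = boto3.client("ecs")
    return ecs.update_service(**request)


request = scale_request("demo-cluster", "web", desired_count=25)
print(request)
```

Calling `scale_service(request)` from the app is what actually changes the running task count.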
At the time we ran a microservices deployment of ~15 services on ~20 hosts. ECS made orchestrating the services easy for a couple of reasons:
Unlike with self-managed Kubernetes on AWS, we could have high availability with a simple cluster of just two machines. Running the Kubernetes control plane in high availability requires a lot of setup, and while tools like kops now help with that, it's still a lot of extra administration. (See https://kubernetes.io/docs/admin/high-availability/) The advantage of ECS here is that you just start two or three instances in different availability zones, each running an agent, and that is all it takes to have high availability. You don't have to pay anything extra for the control plane resources, or worry about monitoring or maintaining it.
Also, AWS ECS integrates really well with all the other AWS services. For example, metrics from your services automatically get piped to CloudWatch, where you can set up an alarm that triggers a Lambda function, or publishes to an SNS topic that triggers a PagerDuty notification. Or you can use the metrics to build a CloudWatch Dashboard that gives a custom overview of your cluster. Logs likewise go to CloudWatch, where you can set up triggers that execute a Lambda function. You can give each service its own IAM role to control which resources (DynamoDB tables, S3 buckets, etc.) that specific service has access to. ECS also integrates really well with Application Load Balancer, which lets you easily set up a mixed architecture, where some traffic is routed to services running as containers under ECS, and other traffic is served by older applications running directly on hosts with no container.
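To make the CloudWatch alarm flow a bit more concrete, here is a minimal sketch of an alarm on a service's CPU metric that publishes to an SNS topic, using boto3's `put_metric_alarm`. The names, the 80% threshold, the evaluation window, and the topic ARN are all placeholder assumptions.

```python
def cpu_alarm(cluster, service, sns_topic_arn, threshold=80.0):
    """Build arguments for cloudwatch.put_metric_alarm on an ECS service.

    The threshold and evaluation window are hypothetical defaults.
    """
    return {
        "AlarmName": f"{cluster}-{service}-high-cpu",
        "Namespace": "AWS/ECS",
        "MetricName": "CPUUtilization",
        # Service level metrics are keyed by cluster and service name.
        "Dimensions": [
            {"Name": "ClusterName", "Value": cluster},
            {"Name": "ServiceName", "Value": service},
        ],
        "Statistic": "Average",
        "Period": 60,
        "EvaluationPeriods": 5,
        "Threshold": threshold,
        "ComparisonOperator": "GreaterThanThreshold",
        # The SNS topic is what would fan out to a pager or a Lambda function.
        "AlarmActions": [sns_topic_arn],
    }


def create_alarm(alarm):
    """Create the alarm. Needs boto3 and valid credentials."""
    import boto3  # imported lazily so the builder above works without it

    cloudwatch = boto3.client("cloudwatch")
    return cloudwatch.put_metric_alarm(**alarm)


alarm = cpu_alarm("demo-cluster", "web", "arn:aws:sns:us-east-1:123456789012:demo-topic")
print(alarm["AlarmName"])
```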
If you are looking for more info as you evaluate whether or not AWS ECS is right for you, please check out this list of ECS resources, most of which were created by the developer community: https://github.com/nathanpeck/awesome-ecs
And feel free to reach out using the Twitter handle or email on my profile if you have any questions or feedback on ECS.
And it seems like maybe the ECS team is trying to move a little too fast recently. They released this blog post, which claims the run-task API supports several new override parameters, but the backend still doesn't actually do anything with them; it just silently ignores them.