Handling traffic between your users and your code is not a trivial problem. Like all good ops problems, you can fix it with good tools, deep knowledge of those tools, fine-grained observability, and smart people running all that.
This has been the recipe for a couple of really successful SaaS offerings. Individual servers? Datadog. CDN? Akamai / Fastly.
Disclaimer: I work at one of those companies, Turbine Labs, and we're trying to make ingress better. Here's a presentation from our CEO on Kubernetes ingress, and why the specification creates the problems that this blog post is trying to fix. https://www.slideshare.net/mobile/MarkMcBride11/beyond-ingre...
I tend to keep my staging and prod clusters identical, even names of services (no prod-web and stage-web, just web).
I set them up in different AWS accounts to clearly separate them; the only differences between them are the cluster's DNS name and who can access them.
Edit: I suck at italicizing and grammar.
I prefer to have everything 100% isolated for dev / qa / stage / prod, and have process and tooling in place to explicitly cross the streams. This comes from a history of pain with random dev-to-prod (or worse, prod-to-dev) access and dealing with "real companies" with things like audit requirements.
Having them separate lets you do things like @odammit suggests, upgrade your cluster in staging without affecting your developers or customers.
If you don't want to go that far, you can set up separate AWS accounts that are all tied together via an organization, and you can set up IAM roles and whatnot to share your API keys between accounts. That gives you at least some isolation, but still lets you GSD the same way as if you had a single account.
Do not do this. You are defeating the purpose of account level separation if you're sharing API keys between accounts. Each AWS environment should be totally segregated from the others (cross-account IAM permissions only if you must), limiting the blast radius in the event of human error or a malicious actor.
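For the "only if you must" case: cross-account access is normally granted with an IAM role in the target account whose trust policy names the other account, rather than by copying keys around. A minimal sketch of such a trust policy (the account ID is a placeholder):

```json
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Principal": { "AWS": "arn:aws:iam::111111111111:root" },
    "Action": "sts:AssumeRole",
    "Condition": { "Bool": { "aws:MultiFactorAuthPresent": "true" } }
  }]
}
```

Users in account 111111111111 then call sts:AssumeRole to get short-lived credentials, so no long-lived API keys ever cross the account boundary, and CloudTrail records every role assumption.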
Source: Previously did devops/infra for 6 years, currently doing security
In our particular case, yes, pretty much. We are a small company with a small development team, so even if I would want to split accounts to different teams, we would end up having one account for 2-3 users, which doesn't make a lot of sense now.
I've been doing patch-level upgrades in-place since the beginning, and never had a problem. For more sensitive upgrades, this is what I do: create a new cluster based on the current state in order to test the upgrade in a safe environment before applying it to production.
And for even more risky upgrades, I go blue/green-like by creating a new cluster with the same stuff running in it, and gradually shifting traffic to the new cluster.
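One way to do that gradual shift, assuming the two clusters sit behind separate load balancers and DNS is on Route 53, is weighted records. This change batch is a hypothetical sketch (hostnames and weights are made up), not necessarily what the author uses:

```json
{
  "Comment": "Shift ~10% of traffic to the new cluster",
  "Changes": [
    {
      "Action": "UPSERT",
      "ResourceRecordSet": {
        "Name": "web.example.com", "Type": "CNAME",
        "SetIdentifier": "blue", "Weight": 90, "TTL": 60,
        "ResourceRecords": [{ "Value": "old-cluster-elb.example.com" }]
      }
    },
    {
      "Action": "UPSERT",
      "ResourceRecordSet": {
        "Name": "web.example.com", "Type": "CNAME",
        "SetIdentifier": "green", "Weight": 10, "TTL": 60,
        "ResourceRecords": [{ "Value": "new-cluster-elb.example.com" }]
      }
    }
  ]
}
```

Applied with `aws route53 change-resource-record-sets`, you can ratchet the green weight up as confidence grows, and the short TTL keeps clients from pinning to the old cluster.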
The NGINX config part is tricky, and it hadn't occurred to me that many programs will try to be smart about machine resources in ways that won't work as expected in the container world. This was a good reminder. OP didn't mention which Linux distro he's using or which OS-level configs he changed at the end of the day; I'd like to see that (were there any configs not mentioned in the post?).
It's awesome that OP had lots of monitoring to guide him through problem discovery and experimentation. I need more of this in my ECS setup. I haven't hopped on the Prometheus train yet, by the way.
I'm using Container Linux, and yes, I did make a few modifications, but I intentionally left them out of the blog post because someone might be tempted to use them as-is.
I'll share more details in that regard if more people seem interested.
The most important thing I found while working with the NGINX controller was that you can just jump into it and do some debugging by poking around at the NGINX configuration inside it. Nothing there is as deep as the insight in this article, but for those who are new to Kubernetes, I hope it's helpful!
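For anyone who wants to try this kind of poking around, something like the following works against the nginx ingress controller (the namespace, labels, and pod name are placeholders; use whatever your deployment has):

```shell
# Find the controller pod (namespace/labels depend on how you installed it)
kubectl get pods -n kube-system -l app=nginx-ingress-controller

# Dump the generated NGINX config to see what the controller actually rendered
kubectl exec -n kube-system <controller-pod> -- cat /etc/nginx/nginx.conf

# Tail the controller's logs while you change an Ingress resource
kubectl logs -n kube-system <controller-pod> -f
```

Diffing the rendered nginx.conf before and after editing an Ingress is a quick way to see exactly what the controller translates your resources into.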
This is not true. If you run Debian / CentOS 7 / Ubuntu, the out-of-the-box settings are good. What you don't want to do is start modifying the network stack based on random blogs.
I agree these are good defaults, but they are not meant to work well for all kinds of workloads. And yes, if things are working for you the way they are, that's okay; there's no need to change anything.
On the other hand, I personally don't know anyone who runs production servers of any kind on top of unmodified Linux distros.
You are so, so lucky... lol. I say that as someone who has come across a desktop CentOS install on a server on multiple occasions, complete with a running X.org and like 3-4 desktop environments to choose from, along with ALL of the extras: KDE's office apps, GNOME's office apps, etc. HORRIBLE.
Really? The distributions might work for the average site, but high-load sites always require tuning beyond the defaults, even on the latest distros.
> "Let me start by saying that if you are not alerting on accept queue overflows, well, you should."
Does anyone know how to effectively keep tabs on this in a Docker container running open-source NGINX? I have an external log/metrics monitoring server that could alert on this, but I'm asking more along the lines of how to get this information to the monitoring server.
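A hedged sketch: on Linux the kernel counts accept-queue overflows in the TcpExt `ListenOverflows` counter in /proc/net/netstat, and /proc is per network namespace, so you'd read it from inside the container (e.g. via `docker exec`) and ship the value to your monitoring server:

```shell
# Print the number of accept-queue overflows since boot for this network
# namespace. /proc/net/netstat has a "TcpExt:" header line of counter names
# followed by a "TcpExt:" line of values; match them up by column.
awk '
  /^TcpExt:/ && !seen_header { for (i = 1; i <= NF; i++) name[i] = $i; seen_header = 1; next }
  /^TcpExt:/                 { for (i = 1; i <= NF; i++) if (name[i] == "ListenOverflows") print $i }
' /proc/net/netstat
```

Run it via `docker exec <container> ...` (or a sidecar sharing the container's network namespace) on a schedule and alert on the counter's rate of increase; `netstat -s` reports the same number in its "times the listen queue of a socket overflowed" line.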
TLS termination at the Ingress Controller and by default unencrypted from there to the service endpoint?
I found this useful: http://blog.wercker.com/troubleshooting-ingress-kubernetes
Interesting discussion here: https://github.com/kubernetes/ingress/issues/257
It seems like a lot of overhead before even starting to process a request!
We are doing TLS termination at the ELB (we're running on AWS).
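For reference, TLS termination at the ELB is usually configured through annotations on the Service that fronts the ingress controller. This is a hedged sketch with a placeholder certificate ARN and labels, not the author's actual config:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: nginx-ingress-lb
  annotations:
    # ACM certificate the ELB presents to clients (placeholder ARN)
    service.beta.kubernetes.io/aws-load-balancer-ssl-cert: "arn:aws:acm:us-east-1:111111111111:certificate/abc123"
    # Terminate TLS on port 443 at the ELB, speak plain HTTP to the pods
    service.beta.kubernetes.io/aws-load-balancer-ssl-ports: "443"
    service.beta.kubernetes.io/aws-load-balancer-backend-protocol: "http"
spec:
  type: LoadBalancer
  selector:
    app: nginx-ingress-controller
  ports:
  - name: https
    port: 443
    targetPort: 80
```

With this shape, traffic is indeed unencrypted from the ELB to the node, which is the trade-off the parent comment is asking about.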
> Interesting discussion here: https://github.com/kubernetes/ingress/issues/257
Regarding ways of updating the NGINX upstreams without requiring a reload, I was just made aware of modules like ngx_dynamic_upstream. I'm sure there are other ways to address this less disruptively than reloading everything, so this is probably something that could be improved in the future.
Could you share which NGINX version you found the reload issue in, and which version the fix was released in?
PS.: I find it interesting/brave that you use a single cluster for several environments.
I'm using 0.9.0-beta.13. I first reported this issue in an NGINX ingress PR, so the last couple of releases don't suffer from the bug I described in the blog post.
> I find it interesting/brave that you use a single cluster for several environments.
I'm not working for a big corporation, so dev/staging/prod "environments" are just three deployment pipelines to the same infrastructure.
As of now, things are running smoothly as they are, but I might as well use different clusters for each environment in the future.
The main issue it tries to solve is how to get traffic from outside the cluster to inside it. The Ingress resource is also supposed to be orthogonal to the ingress controller, so that your app works the same whether it's deployed on AWS or GCP (in practice that's not true, though).
With the nginx ingress controller, the main advantage I see is that you can share port 80 on the nodes between multiple Ingress resources.
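Concretely, two teams can each create their own Ingress resource and the same controller serves both from port 80, routing by host. A sketch using the extensions/v1beta1 API that was current at the time (hostnames and service names are made up):

```yaml
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: blog
spec:
  rules:
  - host: blog.example.com
    http:
      paths:
      - backend:
          serviceName: blog
          servicePort: 80
---
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: api
spec:
  rules:
  - host: api.example.com
    http:
      paths:
      - backend:
          serviceName: api
          servicePort: 80
```

The controller merges both into one NGINX config, so a single set of node ports fronts every app in the cluster.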
I still keep hoping for kubernetes kompose (https://github.com/kubernetes/kompose) to bring the simplicity of Docker Swarmkit to k8s.
Or will Docker InfraKit bring creeping sophistication first and eat Kubernetes' lunch? (https://github.com/docker/infrakit/pull/601)
All of which could be done at the app server level sure, but then that would shift that complexity to your app and your developers.
Oh and job security, obviously.
Kubernetes runs a cluster of machines that act like a mini internet, with many containers running many apps. These apps communicate with each other across containers and machines through a series of proxies, so that each app only has to worry about a single address or service name. Kubernetes does allow headless services, which publish all of the pod IPs under a DNS name if you want that, but this is not the common scenario.
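For the headless case: setting clusterIP to None makes the cluster DNS return the individual pod IPs instead of a single virtual IP. A minimal sketch (the name and selector are placeholders):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  clusterIP: None      # headless: DNS A records point at the pods themselves
  selector:
    app: web
  ports:
  - port: 80
```

Dropping the `clusterIP: None` line gives you the ordinary proxied service with one stable virtual IP, which is the common scenario described above.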
Beyond just knowing endpoints, apps may need to worry about healthchecking, failover, load balancing, rate limiting, security, observability, routing decisions and more. It's far simpler to consolidate all this functionality rather than leaving it to every single app to implement it all over again.
An ingress controller is in charge of running a specific proxy that deals with traffic into and out of the cluster rather than within it. There are several implementations other than nginx, and they all require various levels of tuning to fit the needs of the cluster, but it's a sensible design since you might not want to (or be able to) control traffic on the other side.
Technologies/competencies change as projects evolve, and being able to effortlessly reorganize and reroute is profoundly powerful.
Round-robin DNS might also work or complement this.
Almost everything on the internet is behind layers of proxies; it's not a bad thing, and it isn't much cause for concern.
It's a proven design rule (the end-to-end principle) to prefer putting the smarts at the edges of your system, and the problems stemming from the reverse proxy described in the article count, in my book, as further evidence for this idea.
That's exactly what reverse proxies do - leaving the internal apps free to just serve requests instead of worrying about the perimeter.
The problems described in this article have nothing to do with reverse proxies but rather the ingress controller and config settings.