
Painless NGINX Ingress - danielmartins
http://danielfm.me/posts/painless-nginx-ingress.html
======
trjordan
I think this blog post is one turn of the crank away from a truth we're all
about to learn: don't hand roll your own Kubernetes ingress.

Dealing with the traffic handling between your users and your code is not a
trivial problem. Like all good ops problems, you can fix it with good tools,
deep knowledge of those tools, fine-grained observability, and smart people
running all that.

This has been the recipe for a couple of really successful SaaS offerings.
Individual servers? Datadog. CDN? Akamai / Fastly.

Disclaimer: I work at one of those companies, Turbine Labs, and we're trying
to make ingress better. Here's a presentation from our CEO on Kubernetes
ingress, and why the specification creates the problems that this blog post is
trying to fix. [https://www.slideshare.net/mobile/MarkMcBride11/beyond-
ingre...](https://www.slideshare.net/mobile/MarkMcBride11/beyond-ingresses-
better-traffic-management-in-kubernetes)

------
odammit
This is a great read. I know the _single cluster for all environments_
approach is sort of popular, but it's always made me uncomfortable, both for
the reasons stated in the article and for handling kube upgrades. I'd like to
give upgrades a swing on a staging cluster ahead of time rather than go
straight to prod or build out a cluster just to test an upgrade on.

I tend to keep my staging and prod clusters _identical_, down to the names of
services (no prod-web and stage-web, just web).

I'll set them up in different AWS accounts to clearly separate them; the only
differences between them are the DNS name of the cluster and who can access
it.

Edit: I suck at italicizing and grammar.

~~~
web007
+100 to this. Why would any sane Op/Inf/SRE choose not to have at least
account-level isolation? Is it only a matter of cost due to under-utilization?

I prefer to have everything 100% isolated for dev / qa / stage / prod, and
have process and tooling in place to explicitly cross the streams. This comes
from a history of pain with random dev-to-prod (or worse, prod-to-dev) access
and dealing with "real companies" with things like audit requirements.

Having them separate lets you do things like @odammit suggests: upgrading your
cluster in staging without affecting your developers or customers.

If you don't want to go that far, you can set up separate AWS accounts that
are all tied together via an organization, and you can set up IAM roles and
whatnot to share your API keys between accounts. That gives you at least some
isolation, but still lets you GSD the same way as if you have a single
account.

~~~
toomuchtodo
> and you can set up IAM roles and whatnot to share your API keys between
> accounts. That gives you at least some isolation, but still lets you GSD the
> same way as if you have a single account.

Do not do this. You are defeating the purpose of account level separation if
you're sharing API keys between accounts. Each AWS environment should be
totally segregated from the others (cross-account IAM permissions only if you
must), limiting the blast radius in the event of human error or a malicious
actor.

Source: Previously did devops/infra for 6 years, currently doing security

------
hltbra
Cool read. I don't use Kubernetes but I learned a few things from this blog
post that are applicable to my ECS environment.

The NGINX config part is tricky; it hadn't occurred to me that many programs
try to be smart about machine resources in ways that don't work as expected in
the container world. This was a good reminder. OP didn't mention which Linux
distro he's using or which OS-level configs he ended up changing; I'd like to
see that (was there any config not mentioned in the post?).

It's awesome that OP had lots of monitoring to guide him through problem
discovery and experimentation. I need more of this in my ECS setup. I haven't
hopped on the Prometheus train yet, by the way.

~~~
danielmartins
> OP didn't mention which Linux distro he's using or which OS-level configs he
> ended up changing.

I'm using Container Linux, and yes, I made a few modifications, but I
intentionally left them out of the blog post because someone might be tempted
to use them as-is.

I'll share more details in that regard if more people seem interested.

~~~
robszumski
I'd be interested to hear more.

------
hardwaresofton
Shameless plug! The insights in this article are pretty deep but if you're
looking for just a clumsy step 1 to setting up the NGINX ingress controller on
Kubernetes, check out what I wrote:

[https://vadosware.io/post/serving-http-applications-on-
kuber...](https://vadosware.io/post/serving-http-applications-on-kubernetes/)

The most important thing that I found out while working on the NGINX
controller was that you can just jump into it and do some debugging by poking
around at the NGINX configuration that's inside it. There's no insight in
there as deep as what's in this article, but for those that are maybe new to
Kubernetes, hope it's helpful!

------
Thaxll
"Most Linux distributions do not provide an optimal configuration for running
high load web servers out-of-the-box; double-check the values for each kernel
param via sysctl -a."

This is not true; if you run Debian / CentOS 7 / Ubuntu, the out-of-the-box
settings are good. The thing you don't want to do is start modifying the
network stack based on random blogs.

~~~
danielmartins
> This is not true; if you run Debian / CentOS 7 / Ubuntu, the out-of-the-box
> settings are good. The thing you don't want to do is start modifying the
> network stack based on random blogs.

I agree these are good defaults, but they are not meant to work well for all
kinds of workloads. And yes, if things are working for you the way they are,
that's okay; there's no need to change anything.

On the other hand, I personally don't know _anyone_ who runs production
servers of any kind on top of unmodified Linux distros.

~~~
tinix
> On the other hand, I personally don't know anyone who runs production
> servers of any kind on top of unmodified Linux distros.

You are so, so, so lucky... lol. I say that as someone who has come across a
desktop CentOS install on a server on multiple occasions, complete with a
running X.org and like 3-4 desktop environments to choose from, along with ALL
of the extras: KDE office apps, GNOME office apps, etc... HORRIBLE.

------
manigandham
NGINX also has their own ingress controller (in addition to the kubernetes
community version): [https://github.com/nginxinc/kubernetes-
ingress](https://github.com/nginxinc/kubernetes-ingress)

------
ultimoo
Great read!

>> "Let me start by saying that if you are not alerting on accept queue
overflows, well, you should."

Does anyone know how to effectively keep tabs on this in a Docker container
running open-source NGINX? I have an external log/metrics monitoring server
that could alert on this, but I'm asking more along the lines of how to get
this information to the monitoring server.

~~~
zaroth
It sounded like there's a config directive to have the Ingress Controller
expose all its metrics to Prometheus?

~~~
guslees
If it's helpful at all, here's a concrete example of a k8s nginx setup that
exports-to/is-monitored-by prometheus: [https://github.com/bitnami/kube-
manifests/blob/master/common...](https://github.com/bitnami/kube-
manifests/blob/master/common/nginx-ingress.jsonnet) (Start at
[https://engineering.bitnami.com/articles/an-example-of-
real-...](https://engineering.bitnami.com/articles/an-example-of-real-
kubernetes-bitnami.html) if you would prefer to approach that repo top-down)

------
zaroth
Am I correct in assuming that the Kube Service IP routing happens via iptables
DNAT to get the request to the pod running the Ingress Controller, and that
the Ingress Controller, on top of that, routes traffic to another Service IP
which also has to go through iptables DNAT?

~~~
danielmartins
No. By default, the NGINX ingress controller routes traffic directly to pod
IPs (the Service endpoints):

[https://github.com/kubernetes/ingress/tree/master/controller...](https://github.com/kubernetes/ingress/tree/master/controllers/nginx#why-
endpoints-and-not-services)
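
To make that concrete, here's roughly the kind of object the controller
watches (names and IPs below are made up for illustration). The Endpoints
behind a Service list pod IPs directly, and those IPs - not the Service's
cluster IP - are what end up in the generated NGINX upstreams, which is also
why pod churn forces config reloads:

    apiVersion: v1
    kind: Endpoints
    metadata:
      name: web              # matches the Service name
    subsets:
      - addresses:
          - ip: 10.2.1.17    # pod IP
          - ip: 10.2.3.42    # pod IP
        ports:
          - port: 8080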

~~~
zaroth
Thank you. So there is a DNAT to get to the Ingress Controller, but from there
at least it's direct routing to the service endpoint(s)? Does that mean the
Virtual IP given to the Service is basically bypassed when using an Ingress
Controller?

TLS termination at the Ingress Controller and by default unencrypted from
there to the service endpoint?

I found this useful: [http://blog.wercker.com/troubleshooting-ingress-
kubernetes](http://blog.wercker.com/troubleshooting-ingress-kubernetes)

Interesting discussion here:
[https://github.com/kubernetes/ingress/issues/257](https://github.com/kubernetes/ingress/issues/257)

It seems like a lot of overhead before even starting to process a request!

~~~
danielmartins
> TLS termination at the Ingress Controller and by default unencrypted from
> there to the service endpoint?

We are doing TLS termination at the ELB (we're running on AWS).

> Interesting discussion here:
> [https://github.com/kubernetes/ingress/issues/257](https://github.com/kubernetes/ingress/issues/257)

Great, thanks!

Regarding ways of updating the NGINX upstreams without requiring a reload, I
was just made aware of modules like ngx_dynamic_upstream[1]. I'm sure there
are other, less disruptive ways to address this than reloading everything, so
this is probably something that could be improved in the future.

[1]
[https://github.com/cubicdaiya/ngx_dynamic_upstream](https://github.com/cubicdaiya/ngx_dynamic_upstream)

~~~
gtirloni
May I ask how you are automating the ELB/TLS configuration and how that ties
into the Ingress controller? Do you somehow specify which ELB it should use?
We're in a similar situation.

~~~
danielmartins
You can annotate any Service of type LoadBalancer in order to configure
various aspects[1] of the associated ELB, including which ACM-managed
certificate you want to attach to each listener port.

[1]
[https://github.com/kubernetes/kubernetes/blob/master/pkg/clo...](https://github.com/kubernetes/kubernetes/blob/master/pkg/cloudprovider/providers/aws/aws.go)
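
For illustration, a minimal sketch of such a Service; the ACM ARN, names, and
ports below are placeholders:

    apiVersion: v1
    kind: Service
    metadata:
      name: nginx-ingress
      annotations:
        # ACM certificate to attach to the TLS listener(s)
        service.beta.kubernetes.io/aws-load-balancer-ssl-cert: arn:aws:acm:us-east-1:123456789012:certificate/example
        # which ELB listener port(s) the certificate applies to
        service.beta.kubernetes.io/aws-load-balancer-ssl-ports: "443"
        # terminate TLS at the ELB and speak plain HTTP to the pods
        service.beta.kubernetes.io/aws-load-balancer-backend-protocol: http
    spec:
      type: LoadBalancer
      selector:
        app: nginx-ingress
      ports:
        - name: http
          port: 80
          targetPort: 80
        - name: https
          port: 443
          targetPort: 80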

~~~
gtirloni
Thanks a lot, this will save us quite some time.

------
rjcaricio
Thanks for sharing your experience. I got some great insights to double-check
against my current environment.

Could you share which version of NGINX you found the reload issue with, and
which version the fix was released in?

PS: I find it interesting/brave that you use a single cluster for several
environments.

~~~
danielmartins
> Could you share which version of NGINX you found the reload issue with, and
> which version the fix was released in?

I'm using 0.9.0-beta.13. I first reported this issue in an NGINX ingress
PR[1], so the last couple of releases do not suffer from the bug I described
in the blog post.

> I find it interesting/brave that you use a single cluster for several
> environments.

I'm not working for a big corporation, so dev/staging/prod "environments" are
just three deployment pipelines to the same infrastructure.

As of now, things are running smoothly as they are, but I may well end up
using different clusters for each environment in the future.

[1]
[https://github.com/kubernetes/ingress/pull/1088](https://github.com/kubernetes/ingress/pull/1088)

------
tostaki
Great read! Especially the part on the ingress class, which I didn't know
about. Would you mind sharing some of your Grafana dashboards?

------
mindfulmonkey
I still don't really understand the benefit of an Ingress controller versus
just a Service > Nginx Deployment.

~~~
zimbatm
It's the most confusing part of Kubernetes IMO. It's a load balancer with a
very restricted feature set, so what is it good for?

The main issue it tries to solve is how to get traffic from outside of the
cluster to the inside. The Ingress resource is also supposed to be orthogonal
to the ingress controller, so that it works the same whether your app is
deployed on AWS or GCP (in practice that's not true, though).

With the nginx ingress controller, the main advantage I see is that you can
share port 80 on the nodes between multiple Ingress resources.
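
For example, two Ingress resources with different hostnames can be served by
the same controller, and therefore share the same port on the nodes. A sketch
using the extensions/v1beta1 API and the annotation-based ingress class
(hostnames and service names are made up):

    apiVersion: extensions/v1beta1
    kind: Ingress
    metadata:
      name: app-one
      annotations:
        kubernetes.io/ingress.class: nginx   # which controller picks this up
    spec:
      rules:
        - host: one.example.com
          http:
            paths:
              - backend:
                  serviceName: app-one
                  servicePort: 80
    ---
    apiVersion: extensions/v1beta1
    kind: Ingress
    metadata:
      name: app-two
      annotations:
        kubernetes.io/ingress.class: nginx
    spec:
      rules:
        - host: two.example.com
          http:
            paths:
              - backend:
                  serviceName: app-two
                  servicePort: 80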

------
sandGorgon
Ingress + overlay network confusion was the reason we moved from k8s to Docker
Swarmkit.

I still keep hoping for kubernetes kompose
([https://github.com/kubernetes/kompose](https://github.com/kubernetes/kompose))
to bring the simplicity of Docker Swarmkit to k8s.

Or will Docker InfraKit bring creeping sophistication first and eat
Kubernetes' lunch?
([https://github.com/docker/infrakit/pull/601](https://github.com/docker/infrakit/pull/601))

------
fulafel
Why does everyone use reverse proxies? It seems complex and inefficient. Why
not serve XHRs and other dynamic content from the app server(s) and static
content from a static web server?

~~~
philipcristiano
What would you use to provide a single endpoint to multiple instances of an
app server?

~~~
endorphone
There are scenarios where your app servers might be varied as well -- I've
leveraged reverse proxies in front of a PHP application that had parts in .NET
and parts in Go, for instance.

Technologies/competencies change as projects evolve, and being able to
effortlessly reorganize and reroute is so profoundly powerful.

~~~
fulafel
Sure, I'm sympathetic to this kind of "in the trenches" application of reverse
proxies - just not doing it by default.

