Avoiding the top Nginx configuration mistakes (nginx.com)
417 points by anotherevan on Feb 22, 2022 | 106 comments



Nearly all of these are approximately "users did something they expected to work, but it didn't; shame on them" or "we chose a safe but mostly bad default, shame on users for not realizing it".

I understand the rationale for not changing old defaults for fear of breaking existing configs. However, maybe there could be a release or two of deprecation, or better-optimized defaults under new names, or a (kinda gross but...) strict mode which is opt-in but becomes the default for new installations.

1. Warn the user on startup of potential misconfiguration

2. Allow error_log off to work, as users are expecting it to (see the sketch below)

3. Enable keepalive by default

4. Acknowledge the confusing syntax and find a better solution than blaming users

5. Warn the user on startup of the impact of proxy_buffering off

6. See 4

7. See 1

8. See 1

9. Fix the proxy hash algorithm so that it hashes the whole address (why only the first 3 octets?), or give it a new name; see also 1

10. Detect possible optimizations on startup and apply them automatically; why ask the user to configure something that can be detected automatically?
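For number 2, the trap is that `error_log off;` doesn't disable logging at all; it logs to a file literally named "off". The usual workaround (and, if I recall the article correctly, the one it suggests) is:

  # "error_log off" creates a file named "off" instead of disabling logging
  # error_log off;

  # workaround: log only the highest-severity messages to /dev/null,
  # which effectively discards everything
  error_log /dev/null emerg;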


This x1000. So many of those are terrible default behavior. They admit as much when they say things like "doesn't work how you'd expect".

I get that NGINX is powerful. But I hate dealing with it. Every. Single. Time.


Nginx rose out of the cesspit that was Apache config files. For me it was night and day better and I love the fact Nginx is just set and forget.

That's not to say it can't improve...


The worst part is that these problems only manifest under high load. Unless a team does performance testing (for long periods of time? in a highly-production-like environment), they might never know of many of these issues.


Cynical view: a great way to sell consultancy services / their Plus product?


Less cynical view: the defaults are designed to work for a majority of use cases and the majority of nginx users aren't running high traffic servers.


It doesn't seem the high traffic settings would adversely affect low traffic instances, which is why these defaults smell so bad.


"Doesn't work how you'd expect" reminds me of the time that we had users receiving responses from other people's requests.

Why? Our API is idempotent, but we needed to use POST instead of GET to let the browser send a request body. Our API is very slow, so we put NGINX in front of our API to cache responses. We used something like proxy_cache_key "$request_uri|$request_body"; I don't think I was completely remiss in thinking that $request_body means "request body". Testing showed that it worked just fine. In production, some users made requests that were larger than our test requests - and then $request_body is empty, and users get each other's responses. This behavior is helpfully documented as follows:

"The variable’s value is made available in locations processed by the proxy_pass, fastcgi_pass, uwsgi_pass, and scgi_pass directives when the request body was read to a memory buffer."

Stackoverflow is more helpful: https://stackoverflow.com/questions/18795701/nginx-proxy-cac...
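If I remember the accepted answer right, the fix is to make sure the body always fits in the in-memory buffer, since $request_body is only populated when the body wasn't spilled to a temp file. Something like this (sizes are illustrative):

  # reject bodies larger than the buffer, so $request_body is never empty
  client_max_body_size    16k;
  client_body_buffer_size 16k;
  proxy_cache_key "$request_uri|$request_body";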


Alternatively, the config could be versioned, so that a single directive at the top is sufficient to get saner defaults.


This. I would love to be able to specify a profile in almost every kind of file. Start with, say,

  profile base2022
and upgrade the defaults yearly. Provide a page detailing the differences so users can review what they have to do to up their baseline. Existing users can upgrade at their own pace, new users start with the best possible configuration.


You could do this in a cooperative manner: produce it as a set of include files, put it on GitHub or whatever, and spread the word.


Yes, this is much better than a "normal vs. strict" directive in any conceivable way.

It also helps with slowly deprecating the older behavior: you just say "nginx2004 is now deprecated", and people understand what you mean.


Absolutely. If you (nginx) know how your product should be configured, you should configure it that way out of the box. Don’t blame users.

And if backward compatibility is important, take a leaf out of Rust's book and make an `nginx-config-version 2;` directive or something.

With this set, you can safely use the new defaults. Use it across all example configurations on the website and in distribution packages.


It is interesting to see nginx commit the same mistakes that led Apache to its demise.


> However, maybe there could be a release or two of deprecation, or better-optimized defaults under new names, or a (kinda gross but...) strict mode which is opt-in but becomes the default for new installations.

Something like this or a \version "2.12.0" command would be massively useful. The alternative is a sea of undisclosed footguns that are blindingly obvious once you know they exist but otherwise are well-camouflaged. Ok, the other alternative is for every future developer to brainlessly spam a long incantation of sensible values that they swear they'll learn about later just so a Fortune 500 early-adopter company can have smoother upgrades.


Ah, proxy_buffering. I like nginx but that setting always bites me.


My biggest gripe with nginx is the documentation. I ran into an issue with proxy_buffering being off yet still getting 502 errors, because the upstream was sending headers too large for the default 4k buffers. I naively assumed that with proxy buffering off it would send data to the client immediately and, y'know, not buffer it, as [0] suggests. Turns out that turning proxy_buffering off doesn't turn proxy buffering completely off.

While working on that issue, I also needed to set the proxy buffer size. Quick, what's the difference between proxy_buffer_size [1] and the size argument of proxy_buffers [2]? I still don't know, because the docs for each directive sound to me like they're restating the same exact thing.

[0] https://nginx.org/en/docs/http/ngx_http_proxy_module.html#pr... [1] https://nginx.org/en/docs/http/ngx_http_proxy_module.html#pr... [2] https://nginx.org/en/docs/http/ngx_http_proxy_module.html#pr...
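For what it's worth, my current reading of those docs: proxy_buffer_size is a single, separate buffer for the first part of the upstream response (which is where the response headers have to fit), while the size argument of proxy_buffers applies to the buffers used for the rest of the body. A sketch of that interpretation:

  # one buffer for the first chunk of the response; response headers must fit here
  proxy_buffer_size 8k;

  # number and size of the buffers for the remainder of the response
  proxy_buffers 8 4k;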


Looking at that documentation has me even more confused.

The article says "Proxy buffering means that NGINX stores the response from a server in internal buffers as it comes in, and doesn’t start sending data to the client until the entire response is buffered."

But those parameters say nothing about waiting to send data to the client, and in fact some of them like proxy_busy_buffers_size imply the exact opposite.


Logically the purpose of proxy buffering is to prevent server death from clients holding connections open by reading the data very slowly (slowloris attack). But this doesn't require waiting for the entire upstream response before sending bytes to the client; merely that reading from upstream doesn't stop when the client is reading slowly. It looks like nginx does this while still preventing slowloris.

The parameters mentioned look like they're about tuning the relative memory costs of sending down the first bit of data in a response, which may contain enough information on the client side to do something interesting. proxy_buffer_size looks like it's just the first buffer, while the size argument to proxy_buffers applies to all the buffers.


But it seems it's about response buffering, not incoming (request) buffering. (Or to be more precise slowloris can be done on either/both leg(s) of the request-response process.)


I'm surprised the "set a resolver with a dns ttl" isn't on here; it's probably my only headache with nginx. If you have an upstream that you specify with a hostname, nginx will lookup the host, cache it, and never refresh it by default.

I don't know why it happens to me so often, but I will set up nginx containers that proxy to some load balanced service (via DNS), and then one day the IPs will change and that service would break because I forgot to set resolver.


This issue has bitten me multiple times. You must set the resolver AND use the domain name in a variable. See here: https://www.nginx.com/blog/dns-service-discovery-nginx-plus/ under "Setting the Domain Name in a Variable". The other options ignore the record TTL or require Nginx Plus.

More details here: https://github.com/DmitryFillo/nginx-proxy-pitfalls
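For reference, the incantation looks roughly like this (the resolver address and hostname are illustrative):

  resolver 10.0.0.2 valid=10s;  # your DNS server; valid= caps how long answers are cached

  location / {
      # putting the hostname in a variable forces nginx to re-resolve it at
      # request time instead of only once at startup
      set $backend "app.internal.example";
      proxy_pass http://$backend:8080;
  }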


This x1000. This is why it always bites me; I know there's some magic incantation, but I always forget one part or another, and you never notice until it causes downtime. The git repo is a huge help.


Hey, thanks for writing Caddy's advanced reverse proxy code years ago. I still used it as a model for Caddy 2's new reverse proxy.


Caddy has come a long way since then! When my previous company switched to Caddy this issue was actually one of the motivating factors to adopting Caddy at the time (and I'm pretty sure Caddy 2 is in use today). I still mainly use Caddy over nginx (except where I'm running a no-frills debian system).

This post actually means a lot because Caddy became so popular that (1) I assumed whatever I had written had been completely scrapped and forgotten and (2) no one would ever believe me that I made a big contribution "before Caddy was popular". It's one of the projects I'm so happy to see has come so far.


Ha, that's cool. Yeah, we still have the basic for loop that does upstream selection and load balancing, with the same timeout configs, etc. It's come a long way since then like you said, but I still consider Caddy 2's reverse proxy a derivative of your contribution from, like, week 3 of Caddy v1.


NGINX Plus? That’s a thing? Sheesh. what a business model.


This is my top issue with open-source nginx. Nginx plus has the solution though with its `resolver` directive. https://docs.nginx.com/nginx/admin-guide/load-balancer/http-...


Opensource nginx has the same resolver directive.


But servers in upstream block don't have a 'resolve' setting in non-plus versions.

'resolver' still does something in the open source version, but requires some configuration contortions.


nginx config increasingly feels like a bunch of foot guns waiting to go off for me.

For smaller projects, I've really enjoyed Caddy's "best practice by default" approach to webserver/loadbalancer config where you can generally only choose to remove best practices, not try to build them up from scratch. Has SSL and LetsEncrypt enabled by default, etc. This avoids a lot of the pitfalls of incorrect webserver config altogether, keeps the configuration file really small and easy to read. I wish this style of config was more common.

> https://caddyserver.com


I replaced 846 lines of Apache config files (or 400 lines of nginx config files) with a 29-line Caddyfile. I was worried about switching over but honestly it is really refreshing to not worry about the reverse proxy at all, unless I'm making a unique change.

Sure, LOC is not a great indicator of size, but here's a simple comparison: https://pastebin.com/kmXMX6Yq

And this doesn't include the pain of manually refreshing SSL certificates, certbot always managed to break on me and I could never get auto-SSL to work.


I would guess nginx config is suffering greatly from the requirements of backwards compatibility. Would be really interesting to see what a fresh attempt would come up with (although reaching feature and performance parity would take a really long time).


I've been using nginx for years because "everyone knows" (is it even true anymore?) that it's more performant than apache, and I use a single dedicated server that is rather low-end, because the applications hosted on it are personal/hobby in nature and I don't need AWS/Azure/GCP scalability.

How does caddy compare, performance-wise, to apache and nginx?


Why bother with performance? Does the performance of the http proxy really matter to any real world application?

Serving http is usually completely I/O bound. The more efficient parser in nginx never mattered in the real world. It was used for the much more readable configuration and the fewer bundled foot guns.

What matters for performance is mostly your TCP settings, your buffer sizes, and disk logging. And that's only in the few cases where the upstream application isn't the bottleneck.


>Why bother with performance?

The common wisdom when I last looked into this was that apache uses more CPU, especially under load. Naturally, I want to use the httpd that's most efficient on CPU since it's a finite resource, I'm running several different applications on that box, and you never know when one of your side projects is going to meet the HN hug of death.


> The common wisdom when I last looked into this was that apache uses more CPU,

It can be hard to separate common wisdom from nonsense on the Internet. It's easy to test yourself. There are plenty of easy to use http benchmarks out there, including wrk and ab. Just be wary that the benchmark tool itself will use more cpu than the http server. Which on the other hand might tell you everything you need to know.

If you want to spend your cpu cycles more wisely, you should tune your software and network stack to your request and response size, carefully consider caching and pipelining, not mindlessly switch software. For certain applications this will even make sense.

Of course this has zero relevance if your application spits out a megabyte of content with every request.


At your scale they all have the same performance.


> is it even true anymore?

Not since Apache 2.4 and the latest APR libraries that Apache depends on. Some Linux distributions kept the 2.2 branch for a while and that may be one of the reasons people still associate Apache with slowness.


I actually really like Caddy, though some of its defaults at least historically have been odd, such as responding with 200 where other web servers would respond with a 502 or something similar: https://caddy.community/t/why-does-caddy-return-an-empty-200...

Also, this is a bit of a personal preference, but v1 Caddy felt maybe a bit easier to get started with than v2 Caddy, though sadly it was abandoned. I'm sure that there were good reasons for doing it, but the few forks that started out from it with the intent to maintain it never really went anywhere, so v2 is the only possibility nowadays, unless you want to maintain it yourself or like dead software: https://github.com/WedgeServer/wedge

Of course, that's not a criticism of Caddy itself, just how we as an industry sometimes need rewrites and that makes us keep up with the churn.


The fact of the matter was that _servers aren't actually easy_. Caddy v1 tried to give the illusion that it is, but it just caused more problems, really. The complete rewrite for Caddy v2 was necessary to make Caddy flexible enough to solve like half or more of the open issues the Caddy repo had at that point. Flexibility and simplicity are often at odds. But Caddy tries to keep smart defaults in general.

Also, any time someone brings up the malicious fork "wedge", it saddens me. That was done at literally the most stressful moment in the project's history, and it only made things worse for everyone. I really hope people learn to chill out and give maintainers a bit of time to breathe and respond instead of taking such hostile measures like that. (You can probably pretty easily dig up the Github issue with the discussion if you really care to see what went down. Sigh.)


> The fact of the matter was that _servers aren't actually easy_. Caddy v1 tried to give the illusion that it is, but it just caused more problems, really.

You know, as someone who used Caddy v1 pretty extensively (I still probably have it running on some of my homelab boxes), I never really ran into those supposed problems. Maybe they'd manifest in more complex configurations, but as a reverse proxy, file host, web server integrated with PHP, or even something to allow setting up basicauth, I never found it to be lacking.

That's not to say that Caddy v2 is bad, just that someone for whom v1 worked perfectly might find it a bit cumbersome to move to the newer version, as the old one is no longer supported. Of course, you can say the same about JDK 8 vs JDK 11+ etc.

> Also, any time someone brings up the malicious fork "wedge", it saddens me.

If I recall correctly, it just took the project and rebranded it, which isn't necessarily malicious (aggregating and selling users' data, for example, would be). That's the nature of open source: anyone who wants to do that, can.

Of course, I think that the fork is also irrelevant, because they couldn't actually be bothered to maintain it and nobody cared for it, much like how other projects, like Materialize.css, ended up.

For example, here's the original: https://github.com/Dogfalo/materialize

Then someone got upset that the project was abandoned (even though the maintainers were taking folks' Patreon money): https://github.com/Dogfalo/materialize/issues/6615

They created a fork of their own: https://github.com/materializecss/materialize

Which promptly died down because there just wasn't enough interest in maintaining it. That's just how things go sometimes.


It seems caddyserver makes certificates a lot easier to work with. Certbot is pretty much on par with NGINX in terms of challenging documentation and foot guns.


The problem is best practices change over time.


> The if directive is tricky to use, especially in location{} blocks. It often doesn’t do what you expect and can even cause segfaults.

That's not user error. That's lazy programming and a bug. If your code segfaults because of user input you weren't expecting, that's on you as a programmer.


It wouldn't surprise me if they consider that WONTFIX given they wrote a whole wiki page entitled "If is Evil": https://www.nginx.com/resources/wiki/start/topics/depth/ifis...


I can understand why certain directives don't work inside an if, even if segfault is dumb.

Why are the empty if blocks breaking things outside of them?


Oh I know. Hardly inspires confidence; all of those examples are clear bugs declared features because of poor programming practices.


Nginx would benefit from a simpler, higher-level instruction set, or an official set of 2-5 recipes, as I'm sure 99% of the implementations do a few basic things and don't need the ability to screw it up.


Definitely. Been using it for years for many projects and it's always the same needs: Let's Encrypt + static files + upstream reverse proxy. I had no idea about keepalive not being the default; why isn't this in all the examples and tutorials?
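For reference, the article's fix looks like this (the upstream name and address are illustrative):

  upstream backend {
      server 10.0.0.10:8080;  # illustrative upstream address
      keepalive 16;           # idle keepalive connections cached per worker
  }

  server {
      location / {
          proxy_pass http://backend;
          # per the article, both of these are required for upstream keepalive
          proxy_http_version 1.1;
          proxy_set_header Connection "";
      }
  }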


I'm sure there's space in the tutorial area. A general point for $tech:

If any reader is thinking of writing a tutorial, do check if it's been done before. It's likely a grossly simplified example of how to do X, while the person reading it has read umpteen tutorials on how to do X, Y and Z but still doesn't know where to keep configuration files, or how anything interacts with anything else.

Instead, think of writing Series Of Tutorials That Together Build Moderately Fully Fledged Thing Including Satellite Stuff Like, in this case, Nginx, or SMS, or why XX characters for an email address string, or whatever of the plethora of things to Make It Work are needed.

Some books do this. Not a huge number though, certainly not compared to hit-and-run tutorials.

Call them books. Call them Tutorial Super Series. I think the little extra time invested would generate outsized returns.


For that I use linuxserver's SWAG container.

Let's Encrypt automates cert renewal, and SWAG automates having to remember how in the heck to set up LE.


Nginx config seems like a bottomless pit. It gives me the painful feeling that no matter what I have done with it, there's some security or performance issue still there.

Does this mean I should just stick with a more managed solution? Or is this a common feeling that I should just learn to live with? For my fun projects I use Caddy with mostly default settings.


I wouldn't be that worried about it. Sure there are ways to optimize it more and squeeze every drop of juice out of the grape, but most of the time you're gonna be just fine. It might be the difference between running an extra app server or not.

Now that said, the nginx website has a ton of great examples. You can usually copy/paste them into whatever you need pretty easily, and once it's done it's not likely to change. I would especially do this if nginx is your internet-facing load balancer and has multiple upstreams behind it. Plenty of good examples out there.


This article briefly mentions a very useful analysis tool for NGINX configuration: Gixy.

It looks for the following misconfigurations[0]:

  - [ssrf] Server Side Request Forgery
  - [http_splitting] HTTP Splitting
  - [origins] Problems with referrer/origin validation
  - [add_header_redefinition] Redefining of response headers by "add_header" directive
  - [host_spoofing] Request's Host header forgery
  - [valid_referers] none in valid_referers
  - [add_header_multiline] Multiline response headers
  - [alias_traversal] Path traversal via misconfigured alias
The alias traversal gotcha is one of the most pernicious I've seen. A single, seemingly innocuous '/' is the difference between having a path traversal vulnerability and not.
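A sketch of that one (the paths are made up):

  # vulnerable: no trailing slash on the location, so a request for
  # /static../secret.txt maps to /srv/www/static/../secret.txt
  location /static {
      alias /srv/www/static/;
  }

  # safe: matching trailing slashes on both sides
  location /static/ {
      alias /srv/www/static/;
  }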

[0]: https://github.com/yandex/gixy#what-it-can-do


As an Apache user for relatively simple websites (wordpress, laravel, not high-load web-apps), under moderate load, should I ever migrate to NGINX? I love how easy Apache is to configure. I'm using the event module with PHP-FPM, would I even notice a performance improvement?


Apache is fine. No reason to switch to nginx unless there is something nginx offers that apache doesn't.

In earlier days a lot of people were under the impression that nginx was much faster for PHP, but that was not true. Apache with mod_php was faster, but the issue was memory consumption for non-php files, as the php module was loaded for every request even if it wasn't php.

When using php-fpm you don't need the php module on every request as php-fpm is contacted as needed. This does increase latency and decrease performance for php requests, but the difference is indistinguishable for most people.

Slowloris attacks were (and are) one issue with non-event-based webservers.

Both nginx and apache can handle a six-digit number of php requests each second on a low-end server. Your application (Wordpress, Laravel) will without a doubt be the bottleneck before the webserver is.


I remember one of the early "selling points" of nginx was that it didn't traverse the directory tree looking for a .htaccess file to dynamically load config from on each request, when that was specifically a "selling point" of Apache: it's very useful in shared hosting contexts where users don't have access to change the webserver config.


Considering that's something you can choose to do with apache, it isn't a selling point for nginx. It's an optional feature nginx lacks.

But yeah, there were a lot of weird and misinformed reasons why people recommended others use nginx no matter what the use case was.


Most of the benchmarks I'd seen showing Nginx as significantly faster than Apache boiled down to configuration. Having mod_rewrite turned on with AllowOverride for Apache, for example, slows it down quite a lot.


Sometime ago there was a big push "Apache is old fashioned and clunky! Nginx is sooooo powerful! Move to Nginx!!"

Now we are seeing the opposite backlash.

As someone who doesn't know much about this topic, I also wish there was an easy answer :/


I'm just a hobbyist, so take this as you will, but my experience with NGINX is bad. Sure it might be fast (does that even matter at normal scales?), but its configuration is a pain in the neck, as this article conveys. Common things seemed hard and fragile to me, like getting the reverse proxy set up with Let's Encrypt. As it's just a thin layer on what I am actually doing, spending days or weeks learning the proper way to do things felt like an imposition on my time. At some point I broke my config and gave up on it – I moved my whole site to a new VPS without a reverse proxy and encrypted through Cloudflare.

I've since moved to Caddy, and uh, wow, this is what it's supposed to be like!


If you need to proxy the requests to your Apache server(s), then sure, Nginx is really nice. It's got proper HTTP/2 support unlike Apache, and it also supports proxying mail and other connections if needed.

There's little reason to replace Apache completely, though.


It's always worth learning NGINX, as it is a truly powerful tool, but it's often a PITA to configure, and there are a lot of pitfalls if you use virtualhosts.


Linux distributions of nginx sometimes have their conf.d/ and available-sites.d/ directories in /etc/nginx/ which sometimes can confuse matters.

With software packages such as nginx (in general, not in the context of Linux distributions mentioned above), it's sometimes easier / clearer to start with a completely empty configuration file, and add just enough to make it work (TDD-style), and / or stop it complaining. If you can explain / defend / justify the presence of every line, you stand a much better chance of knowing how it will behave.


Slowloris attacks are the only reason I've switched from Apache to nginx. AFAIK Apache using event/worker still has no protection.


Perhaps some benchmark would convince one.


Traefik has been my go-to reverse proxy for quite a while now. The main advantages are its docker/k8s integration and native ACME support. You can configure services, config, routes, etc. via docker labels. When running under compose, this makes it very clean and convenient to maintain config. It also has plenty of useful middlewares and has a prometheus exporter for its internal stats.


It's good we have many options to choose from.

Apache, nginx, caddy, Traefik, HAProxy, lighttpd, Varnish, Envoy are likely to be the major ones?


One we see a lot is this: https://trac.nginx.org/nginx/ticket/1151

    worker_processes auto;
in a kubernetes environment where the nginx container only has 1 cpu share, but is running on a 64 core node...this does the wrong thing and launches dozens of workers, which do nothing but eat memory. According to the bug above they've not fixed it because they can't tell the difference between this case and other cases where your cpus may be restricted. While that's true, other systems don't just ignore the problem: http://mail.openjdk.java.net/pipermail/hotspot-dev/2019-Janu...
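The workaround is just to stop using auto in containers and pin the worker count to the container's CPU limit, e.g.:

  # match the container's CPU allocation (1 here) instead of the host's core count
  worker_processes 1;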

(we also commonly see issues with buffering, and proxy host header not being set; the latter is an issue when code moves from an environment with physical hosts - which don't care about the host header - to virtual ones)


Another big mistake with defaults is that you need an explicit `server` block with `default_server` for all the `listen` directives you have.

Otherwise, if an address:port you listen on has no server marked default_server, nginx silently uses the first server block with that listen directive as the default.

For example, with a single server doing `listen 80`, requesting any non-matching host will still return this server.
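The usual fix is an explicit catch-all (444 is nginx's non-standard "close the connection without responding" code):

  # explicit default: drop requests for hosts you don't actually serve
  server {
      listen 80 default_server;
      server_name _;
      return 444;
  }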


Isn't it ironic that even the writers of this article don't follow these rules in the examples on that page?

There is "Mistake 3: Not Enabling Keepalive Connections to Upstream Servers", where it says:

    In the location{} block that forwards requests to an upstream group, include the following directives along with the proxy_pass directive:

    proxy_http_version 1.1;
    proxy_set_header   "Connection" "";
Then, at "Mistake 10: Not Taking Advantage of Upstream Groups", the "location" block does not include these two directives, even though the "upstream" block contains a "keepalive" directive.


These are some issues specific to nginx, and probably not the most common errors made. It's fundamentally hard to manipulate http requests without knowing all the forms paths and headers can take.

Careless use of regexps in http routing is a common source of problems. Things like uri encoding, parameters as subdirectories, and control characters are easy to get wrong. Treating user-supplied headers as trusted is also easy to get wrong. It's not uncommon to see configurations which are remotely exploitable, which is another level of bad from what is described in the article.


What are some alternatives to NGINX? Is there a maintained fork with a better config?

I know of Apache, but that seems to be falling out of favour lately.


I like Caddy: https://caddyserver.com/ Nice community and enough features for my (basic) use cases.

The only thing that I don’t quite like is the logging format.


The automatic certificate renewal is a killer feature. Goodbye dodgy cronjobs!


Just responded on alternatives on another post.

https://news.ycombinator.com/item?id=30440028

I'm starting to think Apache isn't bad once again, after pulling my hair out over nginx config complications for the last 10 years.

If I just wanted to alias the system path '/some/path/' as 'https://domain/another_path/' in the URL, I don't know how to do that... Is it 'alias' or 'root', and should it have a slash at the end or not??? I usually just give up and symlink the directory on the system.
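For the record, my understanding of the difference, using the paths from above:

  # alias: the matched prefix is replaced, so /another_path/x serves /some/path/x
  location /another_path/ {
      alias /some/path/;
  }

  # root: the full URI is appended, so /another_path/x would serve
  # /some/path/another_path/x (usually not what you want here)
  location /another_path/ {
      root /some/path;
  }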

Same with proxy_pass and whether the target ends in a slash; that is confusing too.

And the priority of 'location' blocks is a nightmare I have to Google my way through every time I configure nginx, not to mention the cryptic convention of ~ or ^~ and the like, and the fact that you can't effectively nest location blocks.


> I know of Apache, but that seems to be falling out of favour lately.

lately? more like for the last 10 years afaict.


HAProxy is one and I'd like to see a breakdown of how these config issues either spillover or are dealt with differently in that environment.


Thanks so much for replying! Just looked at some example configs and it looks way slicker than nginx.


There's an overlap in features between Nginx and HAProxy, but one isn't necessarily a replacement for the other. Nginx is a full webserver, HAProxy isn't a webserver.


Of late (~1 year) I've migrated from Nginx to AWS ALB and recommend it if you are in the AWS ecosystem.

Being a Layer-7 load balancer, ALB is (almost) feature-comparable with Nginx. You can do route-based and header-based load balancing, terminate SSL, and so on.

For an internet-facing app my preferred setup nowadays is [CloudFront → ALB → application].


I remember someone posting an article of common nginx config security mistakes but can't find it. Does anyone remember? I would like to read through that again as well.


Not a comprehensive article, but the worst security footgun, by far (IMO), is $uri. It's completely unsafe to use $uri in basically any directive! You cannot redirect using it or proxy_pass using it, or you will have a bad time.

https://reversebrain.github.io/2021/03/29/The-story-of-Nginx...
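If I remember that post right, the gist is that $uri is the percent-decoded URI, so an encoded CRLF in the request gets decoded straight into the response headers:

  # unsafe: %0d%0a in the request becomes a literal CRLF inside the
  # Location header (header injection)
  return 301 https://example.com$uri;

  # safer: $request_uri is the original, still-encoded request line
  return 301 https://example.com$request_uri;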


Thanks! That is good info.


The big one for me was the SSL cache size config generated by certbot combined with nchan. That caused so many production issues w/ many concurrent connections.


What is the current recommended web server these days?

Is it still nginx?

I know caddy gets a lot of attention but always assumed it was for smaller/less-trafficked sites (maybe I'm wrong).


Caddy is used by large enterprises and is processing hundreds of thousands of dollars of revenue per hour at one company I know of. You can definitely use it for high volume sites.


Editing the nginx config until it worked correctly has cost me so many days of my life. Luckily there are very good Golang alternatives for 99% of use cases now.


So add_header doesn't (always) add, and instead you have to repeat yourself all over the place? That's just stupid.
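Assuming this is about add_header inheritance (a block that defines any add_header of its own drops everything inherited from outer blocks), the footgun looks like:

  server {
      add_header X-Frame-Options DENY;

      location /api/ {
          # defining any add_header here discards the inherited
          # X-Frame-Options, so it has to be repeated
          add_header Cache-Control no-store;
          add_header X-Frame-Options DENY;
      }
  }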


We still cannot bind NGINX to a specific interface.

sad


What do you mean here? If you are referring to a network interface, why is there a problem binding to it?


You can’t bind nginx to a specific network interface.

You can bind it to a specific IP address though (and then, by extension, also to an interface, but you need to know the IP).


That's right, although I am struggling to think of use cases where you'd need to bind it to a network interface with dynamic IP allocation (otherwise you'd know the IP beforehand).


> although I am struggling to think of use cases where you may need to assign it to network interface with dynamic IP allocation

Some failover configurations come to mind, in particular the one where the box doesn’t get an IP address on the said interface until the failover kicks in, but I don’t know why would anyone run this kind of a setup without NAT.


I did that bind with DHCP server failover.


My use case is simple.

I have five disparate IP subnets, and the firewall must now ascertain that ONLY a certain subnet can access this unbound netdev … in case NGINX (or any daemon) starts to sniff unwarranted IP subnet traffic.

And yes, you can bind to a dynamic IP assigned netdev.

And yes, I've had to replace NGINX with a web server that can do a netdev bind.


What upsets me about Nginx the most is that it takes the fail-fast approach, which isn't entirely suitable for combining it with a container workflow, at least in situations where you're running it without something like Kubernetes (which has its own customizations).

For example, currently i run my own Nginx ingress with Docker Swarm:

  service A / B / C <--> Nginx <--> browser
The reverse proxy configuration feels easier to do than that of Apache2/httpd, for example:

  server {
      listen 80;
      server_name app.test.my-company.com;
      return 301 https://$host$request_uri;
  }

  server {
      listen 443 ssl http2;
      server_name app.test.my-company.com;

      # Configuration for SSL/TLS certificates
      ssl_certificate /app/certs/test.my-company.com.crt;
      ssl_certificate_key /app/certs/test.my-company.com.key;

      # Proxy headers
      proxy_set_header Host $host;
      proxy_set_header X-Real-IP $remote_addr;
      proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
      proxy_set_header X-Forwarded-Proto $scheme;

      # Reverse proxy
      location / {
          resolver 127.0.0.11 valid=30s; # Docker DNS
          proxy_pass http://my-app:8080/;
          proxy_redirect default;
      }
  }
However, suppose that the "my-app" container has a health check set up for it. What does that mean? Docker Swarm won't route any traffic to it until it's finished initializing. Except that it also means that DNS requests won't return anything for "my-app" until at least one healthcheck passes. That in and of itself shouldn't be too problematic, right? Surely any web server would just log that one request as an error and return errors until eventually everything works.

Wrong. Nginx does throw the error and the request fails, but the ENTIRE web server goes down:

  nginx: [emerg] host not found in upstream "my-app"
What does that mean? If you have 1 Nginx instance on the server that needs to proxy 10 apps in total, one app not being available will prevent the other 9 from working! Whoever thought that such defaults would be okay has reasoning that is beyond me, especially considering that the Caddy web server does something similar and will go down if 1 out of 10 sites fails to provision the SSL/TLS certificate, instead of letting the others work - completely opposite of what someone who cares about resiliency would want!

But fear not, supposedly there's a solution!

  # Reverse proxy
  location / {
      resolver 127.0.0.11 valid=30s; # Docker DNS
      set $proxy_server my-app;
      proxy_pass http://$proxy_server:8080/;
      proxy_redirect default;
  }
Except that, you know, that also doesn't work:

  nginx: [emerg] "proxy_redirect default" cannot be used with "proxy_pass" directive with variables
Now, that's not an issue in my case, since I just jotted down the "proxy_redirect default;" value to remember what it does, but it's unfortunate that it is neither ignored (since it's the default value, functionally identical to it not being there), nor can other values be used: http://nginx.org/en/docs/http/ngx_http_proxy_module.html#pro...

In summary, it's nice that there are workarounds for problems like that, but it's a bit odd that the web server is otherwise so quick to fall over and die, and that the defaults aren't oriented towards maximum reliability and resilience. That said, I still do like Nginx, and actually wrote about how to use it as a reverse proxy on my blog a while back: https://blog.kronis.dev/tutorials/how-to-use-nginx-to-proxy-...

But at the same time, in situations where load isn't high enough for the web server to be the bottleneck, it's also helpful to evaluate the alternatives, notably: Apache2/httpd, Caddy and Traefik. It was actually pretty recently that Apache2/httpd got mod_md, which allows provisioning Let's Encrypt certificates out of the box, which is certainly promising: https://httpd.apache.org/docs/trunk/mod/mod_md.html


> especially considering that the Caddy web server does something similar and will go down if 1 out of 10 sites fails to provision the SSL/TLS certificate

What? That's not true. From the Caddy docs [1]:

> Caddy does its best to continue if errors occur with certificate management. By default, certificate management is performed in the background. This means it will not block startup or slow down your sites. However, it also means that the server will be running even before all certificates are available. Running in the background allows Caddy to retry with exponential backoff over a long period of time.

Caddy will actually keep your site online when other servers don't. See the bottom of this page: https://caddyserver.com/v2

[1]: https://caddyserver.com/docs/automatic-https#errors


It's also nice that this means you basically can't use the keepalive setting they mention in this article.

Or rather you can but apparently you need the expensive version of nginx to be able to have upstream servers that resolve hostnames sensibly.


I've assigned this the Global Security Database (GSD) ID GSD-2022-1000285 (https://raw.globalsecuritydatabase.org/GSD-2022-1000285) and started a discussion about these informational entries at https://groups.google.com/a/groups.cloudsecurityalliance.org...

TL;DR: if you run nginx I assume you want to know about these types of potential problems, so sticking it into the security database makes a lot of sense in my opinion for discoverability.


Why would you pick nginx over envoy?


Lots of things aren't cloud native applications. Most of the internet still runs on random C, Java, and PHP apps happily humming along on a handful of servers.


They aren't really the same type of product. Envoy is mostly an alternative to something like HAProxy or Træfik. You wouldn't pick Envoy over Nginx (or Apache) if you need a webserver; that wouldn't make sense. Envoy solves a very specific problem, Nginx solves a bunch of problems.

Also, if you're using Nginx everywhere else already, you might want to use a product you're comfortable with, even in the cases where Envoy could work for you. E.g. we use Apache almost everywhere, because performance is "good enough", and it allows us to use the same product for a lot of different clients, due to the large feature set.


Serves static content (which envoy still can't do AFAICT)

Simpler to set up for a basic proxy.

For more complex usage though, nginx is certainly outclassed (at least the OSS version)


nginx is a reverse proxy and app server. It has all the major features to handle front/edge tasks (like TLS termination), serve static files, and run entire applications in-process given the correct modules. nginx has entire ecosystems around this with OpenResty and even runs the edge workload for Cloudflare's CDN and Algolia's search service.

envoy is a L7 proxy focused on full-duplex networking across multiple protocols with transparency and observability. It can also handle front/edge tasks, although the configuration system is an order of magnitude more complex than nginx's, and it's also missing the app server abilities.

If you use envoy already then it might be easier to deploy it to the edge as well, and it's often used that way in Kubernetes as both service mesh and ingress - but if you just need typical reverse-proxy/app serving, then nginx is still far simpler to set up and maintain.



