The folks at Cloudflare have done it with an iptables TPROXY rule (which requires the socket to have the IP_TRANSPARENT option), which is how I did it too. But there is another way to do this on Linux: you can use an iptables REDIRECT rule, and the userspace program can obtain the original destination port by doing a getsockopt() call to read SO_ORIGINAL_DST.
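For the curious, reading SO_ORIGINAL_DST is a single getsockopt() call that hands back a packed struct sockaddr_in. A minimal Python sketch of the idea (the option value 80 comes from linux/netfilter_ipv4.h; the connection must actually have gone through a REDIRECT/DNAT rule for the call to return anything meaningful):

```python
import socket
import struct

SO_ORIGINAL_DST = 80  # from linux/netfilter_ipv4.h; not in Python's socket module

def parse_sockaddr_in(raw):
    """Decode a packed struct sockaddr_in: family (u16), port (u16,
    network byte order), IPv4 address (4 bytes), then padding."""
    port, = struct.unpack_from("!H", raw, 2)
    return socket.inet_ntoa(raw[4:8]), port

def original_dst(conn):
    """Recover the pre-REDIRECT destination of an accepted TCP connection."""
    raw = conn.getsockopt(socket.SOL_IP, SO_ORIGINAL_DST, 16)
    return parse_sockaddr_in(raw)
```

In a real proxy you would call original_dst() on each socket returned by accept(), then open the outbound connection to that address.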
Edit: oh, I see now the blog post does mention the REDIRECT & SO_ORIGINAL_DST option, but criticizes its performance... which makes sense given its dependence on conntrack.
There is a typo in Cloudflare's blog post: s/SO_TRANSPARENT/IP_TRANSPARENT/
In practical terms, to recover the original target with REDIRECT you have to use the obscure SO_ORIGINAL_DST, while with TPROXY a plain getsockname() on the accepted socket just works. By that token TPROXY is a bit easier to use. This is for TCP; UDP is a bit harder.
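For reference, "harder" for UDP means asking the kernel to attach the original destination to each datagram as ancillary data: set IP_RECVORIGDSTADDR on the socket and pull the sockaddr out of the cmsg returned by recvmsg(). A hedged sketch; the constant value comes from linux/in.h and is not exposed by Python's socket module:

```python
import socket
import struct

IP_RECVORIGDSTADDR = 20  # linux/in.h IP_ORIGDSTADDR; same value is the cmsg type

def parse_sockaddr_in(raw):
    """Decode the struct sockaddr_in carried in the IP_ORIGDSTADDR cmsg."""
    port, = struct.unpack_from("!H", raw, 2)
    return socket.inet_ntoa(raw[4:8]), port

def recv_with_orig_dst(sock, bufsize=65535):
    """recvmsg() wrapper returning (data, peer, original_destination).

    original_destination is None if no IP_ORIGDSTADDR cmsg was attached,
    e.g. because IP_RECVORIGDSTADDR was never enabled on the socket.
    """
    data, ancdata, _flags, peer = sock.recvmsg(bufsize, socket.CMSG_SPACE(16))
    orig = None
    for level, ctype, cdata in ancdata:
        if level == socket.SOL_IP and ctype == IP_RECVORIGDSTADDR:
            orig = parse_sockaddr_in(cdata)
    return data, peer, orig
```

A TPROXY UDP listener would enable the option with `sock.setsockopt(socket.SOL_IP, IP_RECVORIGDSTADDR, 1)` before calling this (that, plus binding with IP_TRANSPARENT, needs CAP_NET_ADMIN).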
Perhaps this works poorly for firewalls near the service, but you declared the problem to be one close to the client, AIUI.
But you don't know who dropped it: the ISP or the remote server. In order to show it's the network between the client and server dropping it, you need a server that behaves in a known way, hence open.zorinaq.com.

I used to work in the InfoSec industry, running port scans from various locations, and open.zorinaq.com was incredibly useful for ensuring there was no random firewall preventing us from finding certain open ports. That was the primary motivation for building the service.
# TPROXY directs all traffic to :1234, and these rules load balance to 4 different processes
iptables -t nat -I OUTPUT -p tcp -o lo --dport 1234 -m state --state NEW -m statistic --mode nth --every 4 --packet 0 -j DNAT --to-destination 127.0.0.1:8080
iptables -t nat -I OUTPUT -p tcp -o lo --dport 1234 -m state --state NEW -m statistic --mode nth --every 4 --packet 1 -j DNAT --to-destination 127.0.0.1:8081
iptables -t nat -I OUTPUT -p tcp -o lo --dport 1234 -m state --state NEW -m statistic --mode nth --every 4 --packet 2 -j DNAT --to-destination 127.0.0.1:8082
iptables -t nat -I OUTPUT -p tcp -o lo --dport 1234 -m state --state NEW -m statistic --mode nth --every 4 --packet 3 -j DNAT --to-destination 127.0.0.1:8083
For the accept-queue load balancing see these blog posts:
Does OpenBSD handle this differently than Linux, or am I doing this wrong?
(N.B.: I won't even be starting on this for probably a month or two, so I haven't even begun to look into it. If there is documentation readily available via a Google search (i.e., I'll find 'em as soon as I Google for 'em) then just ignore my request. Thanks!)
This Python example sets up a transparent HTTP proxy which will show you the basic socket stuff you need to get going.
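The gist of such a proxy can be sketched in a few dozen lines. This is a hedged sketch, not the linked example: it assumes a TPROXY rule is already diverting connections to the listener port, and the IP_TRANSPARENT constant is taken from linux/in.h (setting it requires CAP_NET_ADMIN):

```python
import socket
import threading

IP_TRANSPARENT = 19  # linux/in.h; setting it requires CAP_NET_ADMIN

def make_listener(port, transparent=True):
    """Listening socket for TPROXY-diverted connections."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    if transparent:
        # Lets the socket accept connections addressed to non-local IPs.
        s.setsockopt(socket.SOL_IP, IP_TRANSPARENT, 1)
    s.bind(("0.0.0.0", port))
    s.listen(128)
    return s

def pump(src, dst):
    """Copy bytes one way until EOF, then half-close the write side."""
    while True:
        chunk = src.recv(65536)
        if not chunk:
            break
        dst.sendall(chunk)
    try:
        dst.shutdown(socket.SHUT_WR)
    except OSError:
        pass

def handle(conn):
    # With TPROXY, the accepted socket's *local* address is the client's
    # intended destination, so getsockname() recovers where to connect.
    orig_dst = conn.getsockname()
    upstream = socket.create_connection(orig_dst)
    threading.Thread(target=pump, args=(upstream, conn), daemon=True).start()
    pump(conn, upstream)

def serve(port=1234):
    listener = make_listener(port)
    while True:
        conn, _client = listener.accept()
        threading.Thread(target=handle, args=(conn,), daemon=True).start()
```

An HTTP-aware version would parse the request before connecting upstream, but the socket setup above is the TPROXY-specific part.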
To clarify my original question: what will Cloudflare do if/when iptables finally goes away? Has thought been put into it? Will they implement their own type of TPROXY? Will they continue to support iptables themselves? There are quite a few paths, and I'm interested in which one they deem optimal because I respect their opinions a lot.
Here's a 50-line kernel module that uses TPROXY to do the same thing without touching iptables.
Looking at the nftables code, I think the only reason nftables doesn't support TPROXY is that no one wrote some of the config parsing / serialization stuff.
Being able to set up a Gitlab/Gitea server behind Cloudflare without having to hack around the SSH port limitation would be fun.
Any ideas for an explanation of this recommendation?