Application-Layer DDoS Attack Protection with HAProxy (haproxy.com)
119 points by phil21 on Nov 9, 2018 | 31 comments

I'm the author of this, and have done quite a bit with DDoS and HAProxy; feel free to ask me any questions.

Thank you for this.

I make heavy use of HAProxy and use the silent-drop capabilities on my own personal sites. In addition to stick tables, I also use it heavily with ACLs that pattern match on common attacks and enumeration.

My favorite stick table uses http_err_cnt to block enumeration attempts.

    stick-table	type integer size 127k expire 30m store http_err_cnt
    acl		abuses sc0_http_err_cnt gt 6
    http-response	silent-drop if abuses

I also find it useful to define known bot user-agents and IP addresses, then add conditional ACLs to enforce my robots.txt, since many search engines just ignore it.

    acl BOTS src -f /opt/.ha/bot.txt
    acl BOTS hdr_sub(user-agent) bash url get lack ing team oog lurp ook rawl bot map dback ltx pider scan bmast robe ithub hatweb ansci rchiv null isco =
    acl FILES path_end .html .pl .jpg .jpeg .gif .png .gz .xz .7z .zip .rar .m3u .ogg .mp3 .mp4 .avi
    acl FILES url_sub /post /upload /images
    acl PROT hdr_sub(host) site1 site2 site3
    http-request silent-drop if BOTS FILES PROT

There's probably a more CPU-efficient way to do the above. It was just a quick hack to make robots obey and is not SEO friendly at all. Maybe you have ideas on how I could optimize that.

The first example looks good for catching scanners and the like, though it needs an "http-response track-sc0 src" or nothing will get stored in the stick table. The second example looks good too; since there aren't any regexes in there, the IP/substring/end lookups are quite CPU friendly, so they shouldn't have a large impact even at higher request rates (IP ACL files notably get loaded into a binary tree format and take next to no CPU work to check for a match, even with millions of IPs).
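
A minimal sketch of the first example with a tracking line added. The frontend name, port, and backend are placeholders. Keying the table on the client IP with "type ip" and tracking in http-request rather than http-response are assumptions of this sketch, not the original poster's config:

    frontend fe_main
        mode http
        bind :80
        # One entry per client IP, holding its count of client-side HTTP errors (4xx responses)
        stick-table type ip size 127k expire 30m store http_err_cnt
        # Without a track-sc0 line nothing is ever written to the table
        http-request track-sc0 src
        acl abuses sc0_http_err_cnt gt 6
        # Stop answering clients that have triggered more than 6 errors
        http-request silent-drop if abuses
        default_backend be_app

    backend be_app
        mode http
        server app1 127.0.0.1:8080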

Ah yes, I left off the track-sc0 from my config. Good catch. I tried to stay away from regex or long strings. Ty for verifying that it looks ok.

Partially true. Substring matching patterns are stored in chained lists and could take some CPU if improperly used.

So correct me if I'm wrong: this solution is great if you're the target of a DDoS aimed at overloading your backend services with many malicious requests. However, if you are the target of a bandwidth consumption attack, there isn't much HAProxy can do, correct?

There are still the fundamental limitations of incoming bandwidth, system pps limits, etc.

For volumetric attacks (where the attacker uses DNS reflection or similar to do nothing but eat your bandwidth), pretty much the only solution is to work with your providers to block the sources/protocols, so that the majority of the traffic is stopped before it hits the fiber into the proxy servers. For SYN floods (where spoofed SYN packets are used to fill up the proxy's connection table), the kernel's SYN cookie functions can be used to ignore invalid SYNs or, if that isn't enough, something like our ALOHA's PacketShield can prevent the reflected/spoofed traffic from reaching the proxy at all (though that is still useless if the 10 Gb line is completely full). HAProxy comes into play more if they are actually establishing a TCP connection, even if they don't actually make a request, as then the conn_rate/conn_cur trackers can be used to refuse/drop the connections (and silent-drop leaves their connection tables filled up, making it harder for them to open more). If they actually make a full HTTP request, then http_req_rate comes into play to 403 the requests (and can be combined with connection rate limiting to reduce the number of HTTP requests they can send).
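
A rough sketch of how the conn_cur/conn_rate and http_req_rate trackers mentioned above could be wired together. The names, the 10s windows, and every threshold are illustrative assumptions and would need tuning against real traffic:

    frontend fe_main
        mode http
        bind :80
        # Per-source-IP counters: open connections, connection rate, HTTP request rate
        stick-table type ip size 1m expire 10m store conn_cur,conn_rate(10s),http_req_rate(10s)
        tcp-request connection track-sc0 src
        # Too many concurrent connections from one IP: refuse at the TCP level
        tcp-request connection reject if { sc0_conn_cur gt 50 }
        # Too many new connections in 10s: drop without notifying the client
        tcp-request connection silent-drop if { sc0_conn_rate gt 100 }
        # Too many full HTTP requests in 10s: answer with a 403 (the default deny status)
        http-request deny if { sc0_http_req_rate gt 200 }
        default_backend be_app

    backend be_app
        mode http
        server app1 127.0.0.1:8080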

Most volumetric attacks are UDP reflection based, and relatively simple to filter out at the network level using ACLs. When using decent network gear, these can be filtered at line rate using ASICs. These comprise the vast majority of real-world DDoS attacks.

Having the ability to protect against more focused attacks with HAProxy could be very useful in those few instances where volumetric attacks are not being used.

Not who you’re asking, but I doubt it. You’d need a service or infrastructure that can absorb the traffic (and probably distribute the load across PoPs) before scrubbing it and sending it to your application.

I have an issue that I'm not sure how best to solve, and it isn't DDoS related. We issue our product partners a rate limit of 10,000 API requests per hour, sometimes 50,000. That's fine for us: roughly 3 requests a second is barely volume worth stressing about. But sometimes the partners will fire off 500 concurrent API calls per second. I don't want to have to increase our network capacity to cope with that volume, so I'm looking at options to solve this headache by limiting them to a maximum of 20 requests per second. (There's just no reason why they would ever need more; they have bigger processing capacity than us. I don't want to have to increase our costs if we can avoid it.)

At the moment I'm using AWS and Elastic Load Balancer, so I'm thinking HAProxy just doesn't fit well within the existing network layout.

Additionally, I'm not sure if I should just be solving this with a simple nginx/Redis Lua script. If you have any recommendations, I'd love to hear how other people solve this rate limiting/throttling problem.

Even if HAProxy isn't suitable, I'd like to hear the "HAProxy" solution for it...

In our maps blog post we have a section where we talk about applying different rate limits for different paths (https://www.haproxy.com/blog/introduction-to-haproxy-maps/#r...). For this I'd change the path fetch to a req.hdr(authorization) (or however you send API keys) to limit per API key. You could also add a 20/second with another stick table using the methods in this DDoS post (just with a string key to store API key rather than ip/src to store the IP address).

You can put HAProxy directly on a box running another service to provide protection/reporting if you don't want to replace your ELBs or add HAProxy to them (though sometimes that just adds too much complexity). We also talk about using HAProxy and ELB in our AWS blog post (https://www.haproxy.com/blog/haproxy-amazon-aws-best-practic...).
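
A minimal sketch of the per-API-key limits described above, assuming the key arrives in the Authorization header and fits in 64 characters. The table names, the 429 status, and the hourly table living in its own backend are choices made for this sketch, not something from the blog posts:

    backend st_api_hourly
        # Second table so the hourly quota is tracked separately from the per-second limit
        stick-table type string len 64 size 100k expire 2h store http_req_rate(1h)

    frontend fe_api
        mode http
        bind :80
        # String key: one entry per API key instead of per source IP
        stick-table type string len 64 size 100k expire 10m store http_req_rate(1s)
        http-request track-sc0 req.hdr(authorization)
        http-request track-sc1 req.hdr(authorization) table st_api_hourly
        # 20 requests per second per key
        http-request deny deny_status 429 if { sc0_http_req_rate gt 20 }
        # 10,000 requests per hour per key
        http-request deny deny_status 429 if { sc1_http_req_rate gt 10000 }
        default_backend be_api

    backend be_api
        mode http
        server api1 127.0.0.1:8080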

You might want to check out this post on using HAProxy as an API Gateway:


It has a section on rate limiting. Specifically check out the section which starts with the sentence: "Optionally, instead of a daily limit as used above, you can also do it based on the rate of the requests."

The article mentions attacks from clients sending slow requests, but what about clients reading the response very slowly?

And coming back to the topic of slow requests, what if an attacker tries a slow POST request, sending the first part of the request at a normal speed until it reaches the size of the http-request-buffer, and then sending the rest very slowly? HAProxy would let the request pass and the backend would be kept busy.
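
For reference, a sketch of the settings this question revolves around, with illustrative timeout values. It does not close the gap described above: a body larger than one buffer (tune.bufsize) is still streamed to the server once the buffer has filled.

    defaults
        mode http
        # Maximum time allowed to receive the complete request headers
        timeout http-request 5s
        # Wait for the request body (up to one buffer) before choosing a server;
        # with this option, timeout http-request also covers that part of the body
        option http-buffer-request
        # Inactivity timeouts; timeout client also applies while the client
        # is expected to read (acknowledge) response data
        timeout client 30s
        timeout server 30s
        timeout connect 5s

Because timeout client is an inactivity timeout, a client that keeps reading a few bytes at a time would presumably never trip it, which is the slow-read case raised in the first question.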

I've been learning HAProxy quite extensively over the last few weeks in preparing our network for scaling. It seems like the features are endless but this guide was a fantastic introduction to some of the more intermediate features. Much appreciated!

I'm curious: is SMTP DDoS a thing? Are there attacks around this? If so, has anyone developed good mitigations at the networking level?

I'd love someone to write about that. Everything is always about HTTP/S.

It's not terribly common to attack the SMTP service itself, as usually spam can be sent to cause problems outside of the availability of the SMTP service itself. If SMTP traffic is put through HAProxy, the conn_cur and conn_rate trackers can be used to limit connections per source IP. "inspect-delay" can be used to cause SMTP connections to be held in HAProxy for a while, which makes mail flooding substantially more difficult as well (there is an example of that in the docs if you search for inspect-delay).
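
A sketch of what that could look like in TCP mode. The addresses, thresholds, and the 10s delay are assumptions for illustration; the docs example mentioned above remains the reference:

    frontend fe_smtp
        mode tcp
        bind :25
        stick-table type ip size 100k expire 10m store conn_cur,conn_rate(1m)
        tcp-request connection track-sc0 src
        # Limit concurrent connections and new connections per minute, per source IP
        tcp-request connection reject if { sc0_conn_cur gt 5 }
        tcp-request connection reject if { sc0_conn_rate gt 20 }
        # Hold each connection for up to 10s before it reaches the mail server,
        # and reject clients that start talking before they have seen the banner
        tcp-request inspect-delay 10s
        tcp-request content reject if { req.len gt 0 }
        default_backend be_smtp

    backend be_smtp
        mode tcp
        server mail1 127.0.0.1:2525

Note that this holds every connection for the full delay, so legitimate senders also wait 10 seconds before the banner arrives.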

We have updated the post and added a section to match this use case. You can find it under the subsection "Protecting TCP (non-HTTP) Services".

Although this helps in some cases, it's a bit disingenuous: if an attacker has more bandwidth than you (easily achievable with amplification attacks), it's game over.

The majority of DDoS protection is all about raising the bar so the people attacking you will have more fun attacking someone else and/or be happy attacking your www instead of your service (because taking down www requires no research to find the target and provides more lols). If you upset the wrong people, you may attract more determined attackers, though. You still want a high bar, because that makes it more likely the attack will be trackable.

There are a few classes of effective DDoS:

A) Volumetric traffic/packet counts: the only effective thing here is to have tons of bandwidth or use a service that does. Some of the UDP reflectors out there have a very high amplification rate. Null routing can absorb the bandwidth, but smarter attackers will notice when you move your service and change the target.

B) Application related, but not application specific. Things like Slowloris to hold connections, or just HTTPS floods to use all your handshakes/second. Some of these you can filter, but you sometimes just need more machines to process everything. Apparently HAProxy can help with Slowloris by detecting and dropping slow connections.

C) expensive requests / processing at the application level. If you have a public endpoint that takes 10 seconds to process a request that takes effectively no time to generate, that's definitely a potential thing to be attacked, and that's something HAProxy can definitely help with.

D) Issues in the IP/TCP stack. Sometimes there are gray areas or infrequently used corners of the processing that are very expensive if used frequently (e.g. IP fragment reassembly). HAProxy won't help there.

If HAProxy (or whatever) can help you with low bandwidth DDoS, I think that's still pretty useful.
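
As an illustration of class C, a sketch that rate limits only an expensive endpoint per client IP. The /report path, the 10s window, the threshold, and the 429 status are all hypothetical:

    frontend fe_main
        mode http
        bind :80
        stick-table type ip size 100k expire 10m store http_req_rate(10s)
        # /report stands in for whichever endpoint is slow to serve
        acl is_expensive path_beg /report
        # Count only requests to the expensive path, per client IP
        http-request track-sc0 src if is_expensive
        # More than 5 expensive requests in 10s from one IP gets a 429
        http-request deny deny_status 429 if is_expensive { sc0_http_req_rate gt 5 }
        default_backend be_app

    backend be_app
        mode http
        server app1 127.0.0.1:8080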

Let's say you have an office on a gigabit fiber connection going to a network switch capable of processing 10GbE (assume 10x speed). Hypothetically, would the switch be able to respond quickly enough that performance wouldn't degrade even if the gigabit WAN connection was saturated? Is it just a matter of computing power to deal with the network traffic, or do you actually need to have bigger network pipes than the attack bandwidth?

Potentially, your equipment could be incapable of handling packets at line rate (which is more difficult if the packets are small). That's fairly easy to solve though, especially if you're only looking at gigE: get better hardware and/or software.

The problem is if your attacker is sending you more traffic than your incoming bandwidth. Packets will be dropped, and in most cases you won't be able to control which ones. Depending on how the other side is configured, the packets you do get could be highly delayed. That means actual connections to you are likely going to see a lot of retransmits to you as well as from you. It's possible to still make some progress in these conditions, but not very much, and processing power won't help.

I've done very effective application-layer DDoS attacks in the past with a meager ADSL over Tor. Completely free. But of course you can also spend thousands in a botnet to flood someone off the planet. That's not the point.

Booters are cheaper than $thousands. And while it may cost someone $thousands to keep you offline for days, it probably doesn't. And even if it did, you probably want a better response than "guess we're offline now, lol, l8r" when someone chooses to spend $dozens shutting you down for a couple hours.

If DDoS protection isn't the point, what is?

I don't think the various aspects of the internet that are more or less broken are discussed enough in our circles. Like how easy it is for anyone to take you offline. How easy it is to spoof IP addresses. How useless IP address blocking is. How we demand infinite bandwidth for a low, fixed, monthly price yet don't want to be on the hook when our toaster is DoSing our neighbor and causing real financial damage.

But at the same time we share these little haproxy/fail2ban tips that don't work under actual threat, and then we lament that people use services like CloudFlare instead of talking seriously about how we depend on the free services of large companies, whether it's CloudFlare's DDoS protection or Google's reCaptcha, to prevent real abuse.

Is Cloudflare effectively a super large HAProxy instance? I've always wondered how they are able to absorb so much traffic.

I don't think they use haproxy (or at least they don't heavily rely on it). But once you start with properly scalable tools, you "just" need to have high bandwidth and many machines, and everything becomes easy. Think about it for a second: put a 40 GbE NIC into a single-socket haproxy 1U pizza box, and you get this for $800. Take 25 of these in a rack, connect this to an L3 switch doing ECMP, and you have 1 Tbps of DDoS absorption capacity. For $20K. I know pretty well I'm oversimplifying the problem, but it always starts this way, and after this you adjust for various aspects (small packets, reflection using tools like PacketShield, TLS handshakes using more CPU cores, large connection counts using more RAM) and that's about all.

The heaviest and hardest to maintain features in these environments are the fat stuff that customers want (WAF, monitoring, UI, config versioning, etc). But basic protection is trivial if you can afford the bandwidth.

Wow very interesting! Obviously, you simplified a good deal but it's "relatively" much simpler than what I had envisioned.

Any info on what the other topics will be in this series or is that kept as a surprise?

Soon we'll release a post about bot detection and protection with HAProxy. If you'd like to see us cover a specific topic feel free to reply to me here and I'll pass it onto the team.

Is peering the same as the stick table aggregator?

No, the peers protocol only syncs the stick-table data between other haproxy instances, but doesn't aggregate the values. The stick table aggregator aggregates all of the values from all of your haproxy instances.

Assume that you have 2 haproxy instances and a client makes 2 requests and each request lands on a separate haproxy instance. With peers, the request count would sync as "1" but with the stick table aggregator the request count would sync as "2".
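
A minimal sketch of the peers side of this, assuming two nodes named lb1 and lb2 at placeholder addresses (the stick table aggregator is configured separately and is not shown):

    peers mypeers
        # The local peer name must match the hostname or the -L command-line argument
        peer lb1 192.0.2.11:10000
        peer lb2 192.0.2.12:10000

    frontend fe_main
        mode http
        bind :80
        # Entries in this table are pushed to the other peers as they are updated
        stick-table type ip size 100k expire 10m store http_req_cnt peers mypeers
        http-request track-sc0 src
        default_backend be_app

    backend be_app
        mode http
        server app1 127.0.0.1:8080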

By the way, when you have very few devices to aggregate values from (let's say only 2 haproxy nodes), there's a trick you can use to still aggregate their values over peers, assuming you are willing to maintain different configurations on each node.

The principle is that there will be one table per metric and per node, and each node will track values using its own table. This way each table is written to by only one node, and all nodes get all the table updates. They are then free to watch other nodes' values in their respective tables.

Using the arithmetic converters it's even possible to add or average values between all tables, but then this quickly becomes complicated, as for example you can't easily take into account the number of online nodes in your operations. This is why this is mostly workable with two nodes and not really much more. In this case of two nodes, you either use your own value or the average operation if you can ping the neighbor.
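
A rough sketch of the two-node trick as it might look on node lb1, assuming a peers section named mypeers like the one sketched earlier in the thread. The table names, the variable name, and the threshold of 100 are made up for this illustration, and lb2 would use the same config with the two table names swapped:

    backend st_rate_lb1
        # Written to only by lb1
        stick-table type ip size 100k expire 10m store http_req_rate(10s) peers mypeers

    backend st_rate_lb2
        # Written to only by lb2; lb1 just reads the values received over peers
        stick-table type ip size 100k expire 10m store http_req_rate(10s) peers mypeers

    frontend fe_main
        mode http
        bind :80
        # Track in this node's own table
        http-request track-sc0 src table st_rate_lb1
        http-request set-var(txn.rate_local) sc0_http_req_rate
        # Look up the same client in the neighbor's table and add the local rate
        acl over_limit src,table_http_req_rate(st_rate_lb2),add(txn.rate_local) gt 100
        http-request deny deny_status 429 if over_limit
        default_backend be_app

    backend be_app
        mode http
        server app1 127.0.0.1:8080

This sums the two rates unconditionally; it does not handle the case where the neighbor is offline, which is the complication noted above.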
