Random "wisdom", in no particular order; more like dos and don'ts that I picked up from dealing with and executing DoS/DDoS attacks.
Testing, testing, testing. Regardless of which mitigation you choose and how you implement it, test it and test it well, because there are a lot of things you need to know.
Know and understand the exact effect that the DoS/DDoS mitigation has: the leakage rate, which attacks can still bring you down, and the cost of mitigation.
Make sure you do the testing at different hours of the day. If not, you had better know your application and every business process very well, because I've seen cases where a 50GB/s DDoS would do absolutely nothing except on Tuesday and Sunday at 4AM, when some business batch process would start and the leakage from the DoS attack plus the backend process would be enough to kill the system.
Common processes that can screw you over are backups, site-to-site or co-location syncs/transfers, and various database-wide batches. Pretty common times for these are early morning, end of week, end of month, end of quarter, etc.
If you are using load or stress testing tools on your website, make sure to turn off compression. It's nice that you can handle 50,000 users who all use GZIP, but the attackers can choose not to.
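As a rough sketch of what "turn off compression" means for a test client: the request has to tell the server explicitly that it wants the uncompressed body. The helper name below is made up for illustration, and your load testing tool will have its own setting for this; the only point is the Accept-Encoding header.

```python
import urllib.request


def build_uncompressed_request(url: str) -> urllib.request.Request:
    """Build a GET request that asks the server not to compress the response."""
    # "identity" means: send the body as-is, with no gzip/deflate/br applied.
    return urllib.request.Request(url, headers={"Accept-Encoding": "identity"})


# The request is only built here, nothing is sent over the network.
req = build_uncompressed_request("https://www.mysite.com/")
# urllib stores header names capitalized as "Accept-encoding".
print(req.get_header("Accept-encoding"))
```

Run the same test scenario once with compression and once without, so you know the bandwidth and CPU numbers for both cases.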
Understand what services your website/service relies on for operation; common ones are DNS, SMTP, etc. If I can kill your DNS server, people can't access your website. If I can kill services that are needed for the business aspect of your service to function, like SMTP, I'm effectively shutting you down as well.
If you are hosting your service on a pay-as-you-go hosting plan, make sure to implement a billing cap and a lot of loud warnings. Your site going down might not be fun, but it's less fun to wake up to a 150K bill in the morning. If you are a small business, a DoS/DDoS can result in very big financial damages that can put you out of business.
Understand exactly how many resources each "operation" on your website or API costs in terms of memory, disk access/IOPS, networking, DB calls, etc. This is critical for knowing where to implement throttling and by how much.
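A minimal sketch of how those per-operation costs can turn into a throttle limit: the sustainable rate is set by whichever resource runs out first. All the numbers and resource names below are hypothetical; you get the real ones from measuring your own system.

```python
def max_rate(cost_per_op: dict[str, float], budget_per_sec: dict[str, float]) -> float:
    """Highest sustainable operations/second: the tightest resource wins."""
    return min(budget_per_sec[res] / cost
               for res, cost in cost_per_op.items() if cost > 0)


# Hypothetical cost of one sign-up and hypothetical per-second budgets.
signup_cost = {"db_calls": 6, "disk_iops": 20, "network_kb": 50}
budget = {"db_calls": 1200, "disk_iops": 3000, "network_kb": 100_000}

limit = max_rate(signup_cost, budget)
print(f"throttle sign-ups at about {limit:.0f} per second")
```

With these made-up numbers disk IOPS is the bottleneck, so that is the resource the throttle has to protect.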
If you implement throttling, always do it on the "dumber" layer, the layer that issues the request. For example, if you want to limit the number of DB queries you execute per minute to 1000, do it on the application server, not on the DB server.
This is because you always want to use "graceful" throttling, which means the requester chooses not to make a request rather than the responder having to refuse to respond. It also allows you to implement selective throttling; for example, you might want to give higher priority to retrieving data for existing users than to letting new users sign up, or vice versa.
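Here is a sketch of requester-side throttling with a simple priority rule, assuming a token bucket that lives in the application server and is checked before every DB query. The class name, the reserved share, and the injectable clock are my own illustration, not any particular library.

```python
import time


class QueryBudget:
    """Token bucket consulted by the application server before it queries the DB."""

    def __init__(self, per_minute: int, reserved_for_high: int, clock=time.monotonic):
        self.capacity = per_minute
        self.reserved = reserved_for_high   # tokens only high-priority work may use
        self.rate = per_minute / 60.0       # tokens refilled per second
        self.tokens = float(per_minute)
        self.clock = clock
        self.last = clock()

    def _refill(self) -> None:
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now

    def try_acquire(self, high_priority: bool) -> bool:
        """Return True if the caller may issue one DB query right now."""
        self._refill()
        # Low-priority callers must leave the reserved tokens untouched.
        floor = 0 if high_priority else self.reserved
        if self.tokens - 1 >= floor:
            self.tokens -= 1
            return True
        return False


# 1000 queries per minute, 200 of them kept back for existing users.
budget = QueryBudget(per_minute=1000, reserved_for_high=200)
if not budget.try_acquire(high_priority=False):
    print("sign-ups are temporarily paused")  # the requester backs off on its own
```

When the bucket is empty the application simply does not send the query, so the DB never sees the load and never has to reject anything.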
Do not leak IP addresses. This applies both to load balancing and to using scrubbing services like Cloudflare.
When you use services like Cloudflare, make sure that the services you protect are not accessible directly, and make sure someone can't figure out the IP address of your website/API endpoint by simply looking at the DNS records.
Common pitfalls are www.mysite.com -> Cloudflare IP while mysite.com/www1.mysite.com/somerandomstuff.mysite.com reveal the actual IP address. Another common source is having your IP address revealed via hard-coded URLs on your site or within the SDK/documentation for your API.
If you have moved to Cloudflare "recently", make sure that the IP address of your services is not recorded somewhere; there are many sites that show historic values for DNS records. If you can, rotate your IP addresses once you sign up for a service like Cloudflare, and in any case make sure you block all requests that do not come through Cloudflare.
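A minimal sketch of the "block everything that did not come through the proxy" check. The network ranges below are placeholders from the reserved documentation blocks, not real Cloudflare ranges, so swap in whatever your provider publishes. Ideally you enforce this on the firewall; the code only shows the logic.

```python
import ipaddress

# Placeholder ranges (reserved documentation blocks), NOT real provider ranges.
PROXY_NETWORKS = [ipaddress.ip_network(n) for n in ("192.0.2.0/24", "2001:db8::/32")]


def came_through_proxy(peer_ip: str) -> bool:
    """True if the TCP peer address belongs to one of the proxy's networks."""
    addr = ipaddress.ip_address(peer_ip)
    return any(addr in net for net in PROXY_NETWORKS)


for ip in ("192.0.2.10", "203.0.113.5"):
    print(ip, "allow" if came_through_proxy(ip) else "drop")
```

Check the address of the actual connection, not a header like X-Forwarded-For, because anyone who connects to you directly can put whatever they want in a header.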
When you do load balancing, do it properly. Do not rely on DNS for LB/round robin: if you have 3 front-end servers, do not return 3 IP addresses when someone asks DNS for www.mysite.com; put a load balancer in front of them and return only 1 IP address.
Relying on DNS for round robin isn't smart: it never works that well, and you are allowing the attacker to focus on each target individually and bring your servers down one by one.
Do not rely on IP blacklisting, and whatever the reason, do not ever ever ever use "automated blacklisting", regardless of what your DDoS mitigation provider is trying to tell you. If you only service a single geographical region, e.g. NA, Europe, or "Spain", you can do some basic geographical restrictions, e.g. limit access from, say, India or China. This might not be possible if you are, say, a bank or an insurance provider and one of your customers has to access it from abroad.
Ironically, automated blacklisting hurts most the sites and services that are the easiest to optimize for regional blocking. For example, if you only operate in France you might say "ha! I'll block all non-French IP addresses", but this means that all an attacker needs to do is use IP spoofing, go over the entire range of French ISPs, and you blacklist all of France. This only takes a few minutes to achieve!
If you are blacklisting commercial service provider IPs, make sure you understand what impact it can have on your site. Blacklisting DigitalOcean or AWS might be easy, but then don't be surprised when your mass mail services or digital contract services stop working.
If you do use some blacklisting/geoblocking, use a single list that you maintain. Do not just select "China" in your scrubbing service, firewall, router, and WAF; all of them can have different Chinas, which causes inconsistent responses. Use a custom list and know what is in it.
Do not whitelist IPs! I've seen way too many organizations that whitelist IPs so that those IPs would not go through, for example, their CDN/scrubbing service, or would be whitelisted on whatever "Super Anti DDoS Appliance" the CISO decided to buy into this month.
IP spoofing is easy! Drive-by attacks are easy! And since common IPs to whitelist are things like your corporate internet connection, nothing is easier for an attacker than to figure those out.
They simply need to google for the network blocks assigned to your organization (if you are big enough and/or were incorporated prior to 2005), or send a couple of thousand phishing emails and do some sniffing from the inside.
Understand collateral damage and drive-by attacks. Know who (if anyone) you share your IP addresses with and figure out how likely they are to be attacked. Yes, everyone will piss off someone with keyboard access these days, but there are plenty of types of businesses that are more common as targets. If you are hosting in a datacenter that also provides hosting for a lot of DDoS targets, you might suffer as well.
For drive-by attacks you need to have a good understanding of the syndication of your service and, if you are a B2B service provider, of your customers. If you provide some embedded widget to other sites and they are being DDoSed, you might get hit as well if it is a layer 7 attack.
If you are providing a service for businesses, for example an address validation API, you might get hammered if one of your clients is being DDoSed and the attacker is hitting their sign-up pages.
Optimize your website: remove or transfer large files. Things like documents and videos can be moved to various hosting providers (e.g. YouTube) or CDNs. If you are hosting large files on CDNs, make sure they are only accessible via the CDN; in fact, for the most part it's best to make sure that whatever is hosted on the CDN is only accessible via the CDN. This prevents attackers from fetching the resources from your own servers by targeting your IP instead of the CDN. A common pitfall is that some large file is linked on your website as cdn1.mysite.com/largefile but is also accessible directly from your servers via www.mysite.com/largefile.
Implement anti-scripting techniques on your website: captcha, DOM rendering (it makes layer 7 attacks very expensive for the attacker if they need to render the DOM to execute them), and make sure that every "expensive" operation is protected with some sort of anti-scripting mechanism.
Test this! Captchas that are poorly implemented are no good, and I don't mean captchas that are somehow predictable or easy to read with computer vision. If you have a service that looks like this: LB>Web Frontend>Application Server>DB, make sure that the captcha field is the 1st thing that is being validated, and make sure it's validated in the web frontend or even in the LB/reverse proxy. If you hit the application server, validate all the fields, do the thing, and only validate the captcha just before sending it to the DB, it won't protect you against DoS/DDoS nearly as well, if at all.
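To make the ordering concrete, here is a sketch of a sign-up handler that checks the captcha before anything else. The function names, form fields, and status codes are assumptions for illustration; `verify_captcha` and `create_account` stand in for whatever your stack actually uses.

```python
from typing import Callable


def handle_signup(form: dict,
                  verify_captcha: Callable[[str], bool],
                  create_account: Callable[[dict], None]) -> int:
    """Return an HTTP-style status code; reject before doing any expensive work."""
    # 1. Cheapest check first: a missing or bad token ends the request here.
    token = form.get("captcha_token", "")
    if not token or not verify_captcha(token):
        return 403
    # 2. Only now validate the remaining fields.
    if "@" not in form.get("email", ""):
        return 400
    # 3. The expensive part (application server, DB) happens last.
    create_account(form)
    return 201


created = []
status = handle_signup({"email": "bot@example.com"}, lambda t: True, created.append)
print(status, len(created))  # no token, so nothing reaches create_account
```

The same check is even better placed in the web frontend or the reverse proxy, so a scripted request never gets as far as the application server.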
When you implement any mitigation, design it well and understand leakage and "graceful failure". It's better for the dumb parts of your service to die and restart than it is for the more complicated parts.
For example, if after all of your mitigation you still have 10% leakage from your anti-DDoS/scrubbing service to your web frontend, and from there a 5% leakage to your DB, do not scale the web frontend to compensate for the leakage from your scrubbing service to the point of putting your DB at risk. A web server going down is mostly a trivial thing, as it will usually bring itself back up on its own without any major issues. If your DB gets hammered, well, it's a completely different game: you do not want to run out of memory or disk and have to deal with cache or transaction log corruption or consistency issues on the DB.
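The arithmetic behind that, as a rough sketch. It assumes a deliberately simple model where a saturated frontend just drops whatever is above its capacity; the attack size and capacities are made-up numbers, only the 10% and 5% leakage rates come from the example above.

```python
def db_attack_load(attack_rps: float, scrub_leak: float,
                   frontend_capacity_rps: float, frontend_leak: float) -> float:
    """Attack requests per second that end up at the DB."""
    reaching_frontend = attack_rps * scrub_leak
    # Simplification: a saturated frontend drops everything above its capacity.
    handled = min(reaching_frontend, frontend_capacity_rps)
    return handled * frontend_leak


# Hypothetical 100,000 req/s attack, 10% leaks past scrubbing, 5% of that hits the DB.
small = db_attack_load(100_000, 0.10, 4_000, 0.05)
scaled = db_attack_load(100_000, 0.10, 10_000, 0.05)
print(f"small frontend: ~{small:.0f} req/s at the DB")
print(f"scaled frontend: ~{scaled:.0f} req/s at the DB")
```

In this model the small frontend passes roughly 200 req/s to the DB, while the scaled-up one passes roughly 500. If your DB can only take 300, the scaling is exactly what kills it.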
Just get used to the fact that no matter what you do and implement, if someone wants to bring you down they will. Do what you can, and what is economical for you, to mitigate against certain attacks, and for the rest design your service with predicted points of failure that will recover on their own in the most graceful manner and in the shortest time.
Also, in some cases you want to disable accepting GZIP on the server side completely, because if you accept GZIP-encoded requests an attacker can send very large requests that compress very well, forcing you to decompress them and eat up a lot of memory and CPU cycles on your side only to discard them.
In principle you want to accept only non-compressed requests but send compressed responses to save bandwidth, but in any case you want to know how your service/application scales in all cases and combinations.
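A sketch of both options, assuming a plain dict of lower-cased headers rather than any specific web framework. The first function refuses compressed request bodies outright. The second is a fallback for when you really must accept them: it stops decompressing once a hypothetical size cap is passed. The function names and the cap are mine.

```python
import gzip
import zlib

MAX_BODY_BYTES = 1_000_000  # hypothetical cap on the decompressed size


def check_request_encoding(headers: dict[str, str]) -> int:
    """Return 200 for an uncompressed body, 415 for anything else.

    Assumes the caller already lower-cased the header names.
    """
    encoding = headers.get("content-encoding", "identity").strip().lower()
    return 200 if encoding == "identity" else 415


def bounded_gunzip(body: bytes, limit: int = MAX_BODY_BYTES) -> bytes:
    """Decompress a gzip body, raising ValueError if it grows past `limit`."""
    inflater = zlib.decompressobj(wbits=zlib.MAX_WBITS | 16)  # 16 = gzip container
    out = inflater.decompress(body, limit + 1)  # never inflates more than limit + 1 bytes
    if len(out) > limit:
        raise ValueError("decompressed body exceeds limit")
    return out


# A few KB on the wire that would expand to 5 MB of zeros.
bomb = gzip.compress(b"\0" * 5_000_000)
print(check_request_encoding({"content-encoding": "gzip"}))
try:
    bounded_gunzip(bomb)
except ValueError as err:
    print("rejected:", err)
```

Either way the work you do per request is bounded by something you chose, not by how well the attacker's payload compresses.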