# Observed spamming large amounts of https://en.wikipedia.org/?curid=NNNNNN
# and ignoring 429 ratelimit responses, claims to respect robots:
iptables -A INPUT -s 22.214.171.124/32 -j DROP
Edit: Added count of class-A blocks.
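Appending one iptables rule per address stops scaling once the list grows into the thousands. A minimal sketch of the usual alternative, assuming ipset is installed — the set name and the example ranges (IANA documentation prefixes) are placeholders:

```shell
# Sketch: match an entire set of ranges with a single iptables rule via ipset.
# "badbots" and the ranges below are assumptions, not the commenter's actual list.
ipset create badbots hash:net
ipset add badbots 203.0.113.0/24   # placeholder range (TEST-NET-3)
ipset add badbots 198.51.100.0/24  # placeholder range (TEST-NET-2)
iptables -A INPUT -m set --match-set badbots src -j DROP
```

Lookups against a `hash:net` set are roughly constant time, so adding ranges later (`ipset add badbots ...`) doesn't require touching the iptables rule again.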
To be honest, robots.txt is not for these kinds of bots. These bots are either malicious or incompetent, and more importantly, they're 100% useless to you as a website operator: they offer no SEO benefit, drive no meaningful traffic, and simply consume resources.
The answer, sadly, is to hit them at the web server / load balancer / reverse proxy layer and just brute-force all these bad actors away.
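At the reverse proxy layer this can be as simple as `deny` rules, plus — since many of these crawlers rotate IPs but keep a recognizable User-Agent — a `map` that drops them by UA. A sketch for NGINX; the bot names and the address range are placeholders, not real crawler signatures:

```nginx
# Sketch of blocking at the reverse proxy (http context); UA patterns are hypothetical.
map $http_user_agent $is_bad_bot {
    default        0;
    ~*examplebot   1;  # placeholder crawler UA
    ~*scraperbot   1;  # placeholder crawler UA
}

server {
    listen 80;
    deny 203.0.113.0/24;               # placeholder range (TEST-NET-3)
    if ($is_bad_bot) { return 444; }   # 444: NGINX closes the connection without a response
}
```

Returning 444 instead of 403 avoids spending bytes on a reply the bot will ignore anyway.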
They'll never stop trying, though. Checking NGINX logs, some of these bots have been blocked for years and still knock on the door over and over again.
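A quick way to see this in your own logs is to tally requests per client IP. A one-liner sketch, assuming the default combined log format and a typical log path:

```shell
# Count requests per client IP (field 1 of the combined format) and show the top talkers.
# The log path is an assumption; adjust for your setup.
awk '{print $1}' /var/log/nginx/access.log | sort | uniq -c | sort -rn | head -20
```

Blocked clients that keep retrying will sit at the top of this list long after the DROP rule went in.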