
The best robots.txt for Majestic:

    iptables -A INPUT -s -j DROP

If only it were that easy. Last month MJ12Bot hit my site from 136 distinct IP addresses. Dropping the last octet leaves 120 unique class-C (/24) blocks, dropping the last two octets leaves 43 unique class-B (/16) blocks, and (why not) dropping the last three leaves 31 distinct class-A (/8) blocks. It's a distributed bot and very hard to block, so I think I came out ahead with them no longer spidering my site.

Edit: Added count of class-A blocks.
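
For anyone who wants to run the same tally on their own logs, here is a minimal sketch of the octet-dropping count. The function name `count_prefixes` and the sample addresses are made up for illustration (they are documentation and benchmark ranges, not MJ12Bot's real addresses), and pulling the bot's IPs out of an access log is left out.

```python
import ipaddress


def count_prefixes(addresses):
    """Count distinct IPv4 addresses and the distinct /24, /16 and /8 blocks they fall in."""
    # Parse once; ipaddress raises ValueError on malformed input.
    unique = {int(ipaddress.IPv4Address(a)) for a in addresses}
    counts = {"addresses": len(unique)}
    for prefix_len in (24, 16, 8):
        # Shifting away the host bits is the same as dropping trailing octets.
        shift = 32 - prefix_len
        counts[f"/{prefix_len}"] = len({value >> shift for value in unique})
    return counts


# Hypothetical sample data, with one duplicate to show it is ignored.
sample = [
    "192.0.2.10",
    "192.0.2.10",
    "192.0.2.77",
    "192.0.3.5",
    "198.51.100.1",
    "198.18.0.9",
]

print(count_prefixes(sample))
```

If the /24 count comes out nearly as high as the address count, as it did in the numbers above, blocking by range does not save many rules over blocking individual addresses.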
