Rule 1: Allow Good Bots | Rule 2: Block Potentially Malicious Requests | Rule 3: Block Bad Bots | Rule 4: JS Challenge
Good bots are whitelisted by Rule 1. Some bad bots will be blocked by Rule 2. I’d like Rule 3 to block *all* remaining bots, but that isn’t possible as far as I know. I’ll block as many as I can.

1. Empty User Agent. Bad bots sometimes have an empty user agent. Legitimate traffic should almost never arrive without one, although it is possible. Whether to block empty user agents is debated on the interwebs, with knowledgeable folks opining on both sides. Based on what I find in my firewall log, an empty user agent is almost invariably associated with malicious activity. So, blocked. As with Tor, I’ll live with the risk of blocking a very small amount of legitimate traffic as collateral damage.
(http.user_agent eq "")
2. Naughty Request Methods. Certain request methods are never used by my site, but they could be used by bad bots.
(http.request.method in {"CONNECT" "DEBUG" "DELETE" "MOVE" "PUT" "TRACE" "TRACK"})
3. Upload Scanners. Jeff Starr of Perishable Press identifies todaperfeita as an especially troublesome upload scanner. His excellent 7G firewall also blocks semalt.
(http.referer contains "semalt.com") or
(http.referer contains "todaperfeita")
4. Duplicator Download Attack. A recent widespread attack targeted a vulnerability in the Duplicator plugin, which I sometimes use. The vulnerability was immediately patched, so it no longer poses a hacking risk. But bots still hammer away looking for it.
(http.request.uri.query contains "duplicator_download") or
(http.request.uri.query contains "/wp-config.php")
5. Robots.txt Scofflaws. A few research bots, which are not outright malicious but do me no benefit either, disregard my robots.txt directives.
(http.user_agent contains "researchscan") or
(http.user_agent contains "proximic") or
(http.user_agent contains "Proximic") or
(http.user_agent contains "grapeshot") or
(http.user_agent contains "Grapeshot") or
(http.user_agent contains "filterdb.iss.net")
6. Etc. A final scatter-gun blast to block as many remaining bots as I can, based on common bot user agent snippets.
(http.user_agent contains "crawl") or
(http.user_agent contains "Crawl") or
(http.user_agent contains "CRAWL") or
(http.user_agent contains "bot") or
(http.user_agent contains "Bot") or
(http.user_agent contains "BOT") or
(http.user_agent contains "spider") or
(http.user_agent contains "Spider") or
(http.user_agent contains "SPIDER") or
(http.user_agent contains "spyder") or
(http.user_agent contains "Spyder") or
(http.user_agent contains "SPYDER")
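The three spellings per keyword above are needed because `contains` is case-sensitive. As a rough illustration only (plain Python, not Cloudflare's rule engine; `BOT_SNIPPETS` and `looks_like_bot` are hypothetical names, and the sample user agents are made up), lowercasing the user agent once lets each keyword be listed a single time and also catches mixed-case spellings that the three-variant list would miss:

```python
# Hypothetical names; a local illustration, not Cloudflare's rule engine.
BOT_SNIPPETS = ("crawl", "bot", "spider", "spyder")


def looks_like_bot(user_agent: str) -> bool:
    """Return True if any snippet appears in the user agent, ignoring case."""
    ua = user_agent.lower()
    return any(snippet in ua for snippet in BOT_SNIPPETS)


if __name__ == "__main__":
    samples = [
        "ExampleCrawler/1.0",                     # made-up user agent
        "Mozilla/5.0 (compatible; SomeBOT/2.1)",  # made-up user agent
        "sPyDeR-thing",                           # mixed case, made up
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Firefox/115.0",
    ]
    for ua in samples:
        print(looks_like_bot(ua), ua)
```

In the rule itself, the same idea would presumably be written with the expression language's `lower()` function, for example `(lower(http.user_agent) contains "crawl")`. Check that it behaves as expected on your plan before relying on it.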
That’s it. Here’s the full set for Rule 3 …
(http.user_agent eq "") or
(http.request.method in {"CONNECT" "DEBUG" "DELETE" "MOVE" "PUT" "TRACE" "TRACK"}) or
(http.referer contains "semalt.com") or
(http.referer contains "todaperfeita") or
(http.request.uri.query contains "duplicator_download") or
(http.request.uri.query contains "/wp-config.php") or
(http.user_agent contains "researchscan") or
(http.user_agent contains "proximic") or
(http.user_agent contains "Proximic") or
(http.user_agent contains "grapeshot") or
(http.user_agent contains "Grapeshot") or
(http.user_agent contains "filterdb.iss.net") or
(http.user_agent contains "crawl") or
(http.user_agent contains "Crawl") or
(http.user_agent contains "CRAWL") or
(http.user_agent contains "bot") or
(http.user_agent contains "Bot") or
(http.user_agent contains "BOT") or
(http.user_agent contains "spider") or
(http.user_agent contains "Spider") or
(http.user_agent contains "SPIDER") or
(http.user_agent contains "spyder") or
(http.user_agent contains "Spyder") or
(http.user_agent contains "SPYDER")
Then: Block
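Before deploying a rule this broad, it can help to dry-run it against a few requests pulled from the firewall log. The sketch below is a local approximation of Rule 3 in plain Python, not Cloudflare's own evaluation; the function name `rule3_blocks` and its parameters are assumptions made for illustration, and the substring checks are deliberately case-sensitive to mirror `contains`.

```python
# A local approximation of Rule 3 for dry-running against sample requests.
# All names here are hypothetical; Cloudflare evaluates the real expression.
BLOCKED_METHODS = {"CONNECT", "DEBUG", "DELETE", "MOVE", "PUT", "TRACE", "TRACK"}
BAD_REFERERS = ("semalt.com", "todaperfeita")
BAD_QUERIES = ("duplicator_download", "/wp-config.php")
# Case-sensitive spellings, exactly as listed in the rule.
BAD_AGENTS = (
    "researchscan", "proximic", "Proximic", "grapeshot", "Grapeshot",
    "filterdb.iss.net",
    "crawl", "Crawl", "CRAWL", "bot", "Bot", "BOT",
    "spider", "Spider", "SPIDER", "spyder", "Spyder", "SPYDER",
)


def rule3_blocks(method: str, user_agent: str,
                 referer: str = "", query: str = "") -> bool:
    """Return True if this approximation of Rule 3 would block the request."""
    return (
        user_agent == ""                                # 1. empty user agent
        or method in BLOCKED_METHODS                    # 2. naughty methods
        or any(s in referer for s in BAD_REFERERS)      # 3. upload scanners
        or any(s in query for s in BAD_QUERIES)         # 4. Duplicator probes
        or any(s in user_agent for s in BAD_AGENTS)     # 5 and 6. bot snippets
    )


if __name__ == "__main__":
    # Made-up sample requests, not real log entries.
    print(rule3_blocks("GET", ""))
    print(rule3_blocks("GET", "SomeSpider/1.0"))
    print(rule3_blocks("GET", "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Firefox/115.0"))
```

Feeding it user agents from real visitors is a quick way to spot collateral damage from short snippets such as "bot" before the rule goes live.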

