Tsarev Vadim, 2020-08-15 11:56:04
Nginx

How do I correctly block bots in NGINX at the server level?

In access.log I see the following bots crawling the site:
SemrushBot/6~bl
bingbot/2.0
YandexBot/3.0

(compatible; SemrushBot/6~bl; +http://www.semrush.com/bot.html)
(compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)
(compatible; YandexBot/3.0; +http://yandex.com/bots)

I blocked them in nginx.conf:
include /etc/nginx/conf.d/*.conf;

# List of bots
map $http_user_agent $limit_bots {
    default 0;
    ~*(SemrushBot|SemrushBot/6~bl|YandexBot|YandexBot/3.0|bingbot|bingbot/2.0) 1;
}

In site.ru.conf:
location / {
    if ($limit_bots = 1) { return 403; }
    if (!-e $request_filename) {
        rewrite ^/sitemap.xml$ /sitemap.php;
        rewrite ^/sitemap(\d+).xml$ /sitemap$1.php;
    }
}
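
A trimmed version of the two fragments above might look like the sketch below. It rests on two assumptions worth checking against the real setup. First, the `map` regex is unanchored, so `SemrushBot` should already match `SemrushBot/6~bl` and the version-specific alternatives look redundant. Second, the `if` is moved up into the site's `server` block (whose other contents are not shown in the question, so that layout is assumed) so that it covers every location rather than only `location /`.

```nginx
# nginx.conf, inside http { }
# ~* makes the regex case-insensitive; it matches anywhere in the User-Agent.
map $http_user_agent $limit_bots {
    default 0;
    ~*(SemrushBot|YandexBot|bingbot) 1;
}

# site.ru.conf (assumed layout)
server {
    # listen, server_name, root and the rest of the site's settings go here

    # An nginx "if" treats "0" and the empty string as false, so "= 1" is not needed.
    if ($limit_bots) {
        return 403;
    }

    location / {
        if (!-e $request_filename) {
            # Dots are escaped so that they match a literal "." only.
            rewrite ^/sitemap\.xml$ /sitemap.php;
            rewrite ^/sitemap(\d+)\.xml$ /sitemap$1.php;
        }
    }
}
```

With `return`, the request is answered during nginx's rewrite phase, before any file lookup or PHP handler runs, so a 403 issued this way should cost little more than parsing the request and writing the log entry.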

I have already tried it both ways: SemrushBot and SemrushBot/6~bl.
In access.log there are entries like this (the path is not absolute):
- [15/Aug/2020:10:29:14 +0200] "GET /categ/product-13.html HTTP/1.1" 403 153 "-" "Mozilla/5.0 (compatible; SemrushBot/6~bl; +http://www.semrush.com/bot.html)"

Does returning a 403 keep these requests from loading the server? Thanks.


3 answers
ky0, 2020-08-15
@ky0

Blocking bots this way is pointless. Those that honestly identify themselves in the user agent usually do not crawl aggressively and, more or less, respect what is written in robots.txt. The vast majority of scrapers that misbehave pretend to be ordinary browsers.
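
Since scrapers that imitate browsers cannot be caught by a User-Agent list, a per-address rate limit is one server-level alternative inside nginx. A minimal sketch, assuming the zone name `perip`, the 10 MB zone size, the rate of 5 requests per second, and the burst of 10, all of which are illustrative values not taken from the thread:

```nginx
# nginx.conf, inside http { }
# One shared-memory zone keyed by client address; name and numbers are placeholders.
limit_req_zone $binary_remote_addr zone=perip:10m rate=5r/s;

# site.ru.conf, inside server { }
location / {
    # Allow short bursts of up to 10 extra requests and reject anything beyond that.
    limit_req zone=perip burst=10 nodelay;
    # Answer rejected requests with 429 instead of the default 503.
    limit_req_status 429;
}
```

Limits this tight can also hit legitimate crawlers and users behind a shared address, so the numbers would need tuning against real traffic.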

kocherman, 2020-08-15
@kocherman

Which nginx? Why nginx at all? Your options: Cloudflare or reCAPTCHA!

Dr. Bacon, 2020-08-15
@bacon

If bots are overloading your site, the problem is not the bots but you. Spend the time optimizing the site's code rather than blocking these bots.
