How to allow access to the robots.txt file only to search robots?
www.useragentstring.com/pages/Crawlerlist
I'll add: also do a reverse DNS lookup on the client IP to verify that the request really comes from Google and not from someone faking its User-Agent.
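For illustration, the reverse DNS check could look roughly like the sketch below. The function name `is_verified_googlebot` and the injectable `reverse_lookup` / `forward_lookup` parameters are made up for this example; the idea is to resolve the IP to a hostname, check that the hostname is under googlebot.com or google.com, and then resolve that hostname back to confirm it maps to the same IP.

```python
import socket

# Hostname suffixes accepted for Google crawlers in this sketch.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def is_verified_googlebot(ip,
                          reverse_lookup=socket.gethostbyaddr,
                          forward_lookup=socket.gethostbyname_ex):
    """Return True if `ip` passes a reverse + forward DNS check for Google.

    The lookup functions are parameters (an assumption of this sketch)
    so the check can be exercised without network access.
    """
    # Step 1: reverse lookup, IP -> hostname.
    try:
        hostname = reverse_lookup(ip)[0]
    except OSError:  # socket.herror and socket.gaierror are OSError subclasses
        return False

    # Step 2: the hostname must belong to one of Google's crawler domains.
    if not hostname.lower().rstrip(".").endswith(GOOGLE_SUFFIXES):
        return False

    # Step 3: forward lookup, hostname -> IPs, must contain the original IP.
    # Note: gethostbyname_ex only handles IPv4 addresses.
    try:
        addresses = forward_lookup(hostname)[2]
    except OSError:
        return False
    return ip in addresses
```

The forward lookup in step 3 matters: anyone who controls the reverse DNS zone for their own IP can make it claim to be a googlebot.com host, but they cannot make Google's own DNS point that hostname back to their IP.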
You probably don't want users to learn your site's internal structure. To solve this, move everything that should be closed into a single directory (section) and disallow that directory as a whole. That way you don't reveal the structure, and you still tell the robot not to go there.
Identifying search bots is as easy as pie: almost all of them state their identifier in the User-Agent header, and on top of that they come from known IP subnet ranges. Lists of both are easy to find with a quick search.
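A minimal sketch of that two-part check, assuming you maintain your own lists: the names `CRAWLER_TOKENS`, `CRAWLER_NETWORKS` and `looks_like_search_bot` are hypothetical, the tokens are only a small illustrative sample, and the subnet is a documentation placeholder (192.0.2.0/24), not a real crawler range. Replace both with the lists the search engines publish.

```python
import ipaddress

# Illustrative sample of User-Agent tokens; extend from a published list.
CRAWLER_TOKENS = ("googlebot", "bingbot", "yandexbot")

# PLACEHOLDER subnet from the documentation range, not a real crawler range.
CRAWLER_NETWORKS = [ipaddress.ip_network("192.0.2.0/24")]


def looks_like_search_bot(user_agent, remote_ip):
    """Return True only if both the User-Agent and the source IP match."""
    # The User-Agent alone is trivially spoofable, so it is only a first filter.
    ua = user_agent.lower()
    if not any(token in ua for token in CRAWLER_TOKENS):
        return False

    # The source address must also fall inside one of the known subnets.
    try:
        address = ipaddress.ip_address(remote_ip)
    except ValueError:
        return False  # not a valid IP address
    return any(address in network for network in CRAWLER_NETWORKS)
```

Requiring both conditions is deliberate: a request that merely claims to be a crawler in its User-Agent but comes from an unknown address is treated as an ordinary visitor.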
The catch is that if you accidentally fail to give one of the bots access to robots.txt, it will happily index all of your pages and make them public. So it's better not to use this method.