PHP
Alexander Sharomet, 2015-08-25 22:16:31

How to allow access to the robots.txt file only to search robots?


4 answer(s)
Dimonchik, 2015-08-25
@sharomet

www.useragentstring.com/pages/Crawlerlist
I would add: also do a reverse DNS lookup to verify that the visitor really is Google.
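As a rough illustration of the reverse DNS check mentioned above, here is a minimal PHP sketch of a forward-confirmed lookup: resolve the IP to a host name, check the domain, then resolve the host name back and compare. The function name `isVerifiedCrawler`, the suffix list, and the injectable resolver parameters are assumptions made for this example, not something from the thread. Note that `gethostbynamel` only returns IPv4 addresses, so this sketch does not cover IPv6 visitors.

```php
<?php
/**
 * Forward-confirmed reverse DNS check (sketch, not a definitive implementation).
 * $allowedSuffixes is an assumed list; check the search engine's own docs.
 * $reverse / $forward are hypothetical hooks so the lookups can be replaced in tests.
 */
function isVerifiedCrawler(
    string $ip,
    array $allowedSuffixes = ['.googlebot.com', '.google.com'],
    ?callable $reverse = null,
    ?callable $forward = null
): bool {
    $reverse = $reverse ?? 'gethostbyaddr';
    $forward = $forward ?? 'gethostbynamel';

    // Step 1: reverse lookup, IP -> host name.
    // gethostbyaddr returns the unchanged IP (or false) when it fails.
    $host = $reverse($ip);
    if (!is_string($host) || $host === $ip) {
        return false;
    }
    $host = strtolower(rtrim($host, '.'));

    // Step 2: the host name must END with one of the expected domains.
    $matches = false;
    foreach ($allowedSuffixes as $suffix) {
        if (substr($host, -strlen($suffix)) === $suffix) {
            $matches = true;
            break;
        }
    }
    if (!$matches) {
        return false;
    }

    // Step 3: forward lookup, host name -> IPs, must include the original IP.
    $ips = $forward($host);
    return is_array($ips) && in_array($ip, $ips, true);
}
```

In a request handler this would typically be called with `$_SERVER['REMOTE_ADDR']`. DNS lookups are slow, so caching the result per IP is probably worthwhile.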

OnYourLips, 2015-08-25
@OnYourLips

It can't be done. There is no way to know for certain whether a visitor is a robot.

Oleg Shevelev, 2015-08-26
@mantyr

You probably don't want users to learn your site's internal structure. To solve this, move everything that should be closed into a single directory (section) and disallow that directory as a whole. That way robots.txt doesn't reveal the structure, and it still tells the robot not to go there.
Identifying search bots is as easy as shelling pears: almost all of them announce themselves in the User-Agent header, and they also come from known IP subnet ranges. Lists of both are easy to find with a search.
The catch is that if you accidentally fail to give one of the bots access, it will never see your rules, so it will happily index all your pages and make them publicly searchable. So it is better not to use this method.
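For reference, the User-Agent check described in this answer could be sketched in PHP roughly as follows. The token list is a short, hypothetical sample (real crawler lists are much longer), and the function names `looksLikeSearchBot` and `robotsTxtResponse` are invented for this example. The User-Agent header is trivially spoofed, and the answer's own caveat applies: any bot missing from the list gets a 404 and never sees the rules.

```php
<?php
// Hypothetical, incomplete sample of crawler User-Agent tokens.
const BOT_TOKENS = ['Googlebot', 'bingbot', 'YandexBot', 'DuckDuckBot', 'Baiduspider'];

/** Case-insensitive substring match of the User-Agent against known tokens. */
function looksLikeSearchBot(string $userAgent, array $tokens = BOT_TOKENS): bool
{
    foreach ($tokens as $token) {
        if (stripos($userAgent, $token) !== false) {
            return true;
        }
    }
    return false;
}

/** Returns [HTTP status, body]: the rules for recognised bots, 404 for everyone else. */
function robotsTxtResponse(string $userAgent, string $rules): array
{
    return looksLikeSearchBot($userAgent) ? [200, $rules] : [404, ''];
}
```

Assuming the web server is configured to route `/robots.txt` to a PHP script, that script could call `robotsTxtResponse($_SERVER['HTTP_USER_AGENT'] ?? '', $rules)`, pass the status to `http_response_code()`, and echo the body with a `Content-Type: text/plain` header.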

xmoonlight, 2018-10-31
@xmoonlight

UserAgentList
Crawlers IP List
