robots.txt
Dextor23, 2020-01-14 14:06:59

Is there a mistake in this robots.txt?

How do I fill in robots.txt correctly?
I read earlier that the logic should be as follows: first list what the bot must not crawl, then allow everything else.

User-agent: *
Disallow: /cabinet/
Disallow: /cabinet/register/
Disallow: /cabinet/forget/
Disallow: /search/
Allow: /

User-agent: Yandex
Disallow: /cabinet/
Disallow: /cabinet/register/
Disallow: /cabinet/forget/
Disallow: /search/
Allow: /

Host: site.com
Sitemap: https://site.com/sitemap.xml

Did I get it right? I am asking because I see that other sites only have
Disallow:
Disallow:
Disallow:
with no Allow: / at the end.
Please explain.

1 answer(s)
Notan Royamov, 2020-01-14
@Royamov

Explicitly specifying Allow: / is equivalent to omitting it (the default is to allow crawling of the entire site), which is why nobody writes it :)
The correct robots.txt in your case would be:

User-agent: *
Disallow: /cabinet/
Disallow: /search/
Sitemap: https://site.com/sitemap.xml
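
To check that the short file behaves the same as the long one, here is a minimal sketch using Python's standard urllib.robotparser. The URLs and the crawler names in the lists are assumptions picked only for the test. Note that this parser applies the first matching rule, whereas real crawlers may use longest-match, so treat it as a sanity check rather than a statement about how any particular bot behaves.

```python
from urllib.robotparser import RobotFileParser

# Rules from the question: two identical groups, each ending with "Allow: /".
VERBOSE = """\
User-agent: *
Disallow: /cabinet/
Disallow: /cabinet/register/
Disallow: /cabinet/forget/
Disallow: /search/
Allow: /

User-agent: Yandex
Disallow: /cabinet/
Disallow: /cabinet/register/
Disallow: /cabinet/forget/
Disallow: /search/
Allow: /

Host: site.com
Sitemap: https://site.com/sitemap.xml
"""

# Shortened rules from the answer.
CONCISE = """\
User-agent: *
Disallow: /cabinet/
Disallow: /search/
Sitemap: https://site.com/sitemap.xml
"""

# Hypothetical URLs and crawler names, used only for this comparison.
URLS = [
    "https://site.com/",
    "https://site.com/cabinet/",
    "https://site.com/cabinet/register/",
    "https://site.com/search/",
    "https://site.com/about/",
]
AGENTS = ["Googlebot", "Yandex"]


def load(text):
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


def verdicts(parser):
    """Return can_fetch() results for every agent/URL pair, in a fixed order."""
    return [parser.can_fetch(agent, url) for agent in AGENTS for url in URLS]


verbose_result = verdicts(load(VERBOSE))
concise_result = verdicts(load(CONCISE))
print(verbose_result == concise_result)
```

If the comparison prints True, then for these sample URLs the extra Allow: / line, the nested /cabinet/ paths and the separate Yandex group change nothing.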

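Host is an obsolete directive (and in any case it must be specified with https if the main mirror is served over a secure protocol); the rest is simply redundant: /cabinet/register/ and /cabinet/forget/ are already covered by Disallow: /cabinet/, and the Yandex group only repeats the rules of the * group.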
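
Why the nested paths are redundant can be sketched in a few lines of Python. This assumes plain prefix matching only (no * or $ wildcards and no Allow rules mixed in), and the helper name redundant_rules is made up for this illustration.

```python
def redundant_rules(paths):
    """Return paths that are already covered by another, shorter prefix in the list."""
    redundant = []
    for path in paths:
        for other in paths:
            # A rule is redundant if a different rule is a prefix of it.
            if other != path and path.startswith(other):
                redundant.append(path)
                break
    return redundant


# Disallow paths from the question.
rules = ["/cabinet/", "/cabinet/register/", "/cabinet/forget/", "/search/"]
print(redundant_rules(rules))  # ['/cabinet/register/', '/cabinet/forget/']
```

Dropping the paths it reports leaves exactly the two Disallow lines from the answer.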
