Search engines
Konstantin, 2019-10-22 08:52:51

How will the search engine behave?

Suppose a site has only three pages, each with some text:
example.com, example.com/page1.html, example.com/page2.html
The pages link to each other in a chain:
main -> page1 -> page2
In robots.txt, only page1 is blocked, for all bots. None of the pages has any additional meta tags.
Before indexing, the search robot knows only about the main page, and there is no information about the site in other sources (for example, on other sites).
Will the robot index page2 if the only link to it is on page1?
In other words, does the robot follow links on blocked pages?
Please support your answer with a link to documentation. For simplicity, let's consider two search engines: Yandex and Google.
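
For concreteness, the setup can be written down as a small check. This is only a sketch: the robots.txt content below is an assumption about what "only page1 is blocked for all bots" looks like, and it uses Python's standard urllib.robotparser rather than any real search engine's parser, so it shows what a rule-following crawler is permitted to fetch, not what Yandex or Google will actually index.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt for the scenario: only page1 is blocked, for all bots.
ROBOTS_TXT = """\
User-agent: *
Disallow: /page1.html
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

urls = [
    "https://example.com/",
    "https://example.com/page1.html",
    "https://example.com/page2.html",
]

# Map each URL to whether a generic bot ("*") may fetch it.
allowed = {url: parser.can_fetch("*", url) for url in urls}

for url, ok in allowed.items():
    print(url, "allowed" if ok else "blocked")
```

Note that page2 itself is not blocked here; the open question is only whether the robot can discover it when the sole link to it sits on the blocked page1.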

1 answer(s)
pcdesign, 2019-10-22
@andronof

The robot will go wherever it can and index everything that is not prohibited.
What is prohibited will be indexed too; it just won't take part in search, and no snippets will be built for it.
So as not to be unfounded, here is a site with a robots.txt like this, that is, everything is prohibited:

User-agent: *
Disallow: /

Google indexed it just fine, but did not create snippets. Screenshot
Now the documentation. Quote: "The robots.txt file tells search robots which pages or files on your site can or cannot be processed. Use it to limit the number of requests your server receives and reduce the load on it. This file is not intended to prevent web pages from showing up in Google search results. If you do not want any material from your site to be submitted to Google, use the noindex directives. You can also create password-protected sections on the site."
https://support.google.com/webmasters/answer/60626...
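
To relate this to the three-page chain from the question, here is a rough simulation. It is a sketch under stated assumptions, not a description of how Yandex or Google work internally: the site is a hypothetical in-memory link map, the crawler strictly honors robots.txt, and it learns about new URLs only from pages it has actually fetched. The names SITE_LINKS and crawl are made up for the illustration.

```python
from collections import deque
from urllib.robotparser import RobotFileParser

# Hypothetical in-memory site: each path maps to the links found on that page.
SITE_LINKS = {
    "/": ["/page1.html"],
    "/page1.html": ["/page2.html"],
    "/page2.html": [],
}

# Assumed robots.txt: only page1 is blocked, for all bots.
ROBOTS_TXT = ["User-agent: *", "Disallow: /page1.html"]


def crawl(start="/"):
    """Breadth-first crawl that never fetches a disallowed path."""
    robots = RobotFileParser()
    robots.parse(ROBOTS_TXT)

    fetched = []        # pages whose content was actually read
    known = {start}     # every URL the crawler has heard of
    queue = deque([start])

    while queue:
        path = queue.popleft()
        if not robots.can_fetch("*", path):
            # The URL is known, but its content and links are never read.
            continue
        fetched.append(path)
        for link in SITE_LINKS[path]:
            if link not in known:
                known.add(link)
                queue.append(link)
    return fetched, known


fetched, known = crawl()
print("fetched:", fetched)
print("known but not fetched:", sorted(known - set(fetched)))
print("never discovered:", sorted(set(SITE_LINKS) - known))
```

Under these assumptions only the main page is fetched; page1 is known as a bare URL (which matches the "indexed, but without a snippet" behavior described above), and page2 is never discovered because the only link to it is on a page the crawler is not allowed to read. A real search engine may still learn about page2 from other sources, such as a sitemap or external links.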
