How do I get search engines to leave index.html out of their results but still crawl it?


robots.txt

zencd, 2013-01-25 13:18:33

The site has a typical structure. It uses Google site search, but of course visitors also arrive from ordinary web search.

/index.html
/page1.html
/page2.html
…

I would like to disable indexing of these pages. That is, the search engine should see them and follow their links, but not show them to people in search results (while still showing the pages of individual, complete articles). So robots.txt is not suitable here, right?

The problem is that these intermediate pages turn up in search results and clutter them, and after clicking such a link the visitor does not find the desired content, because the content of the pagination pages changes.

What is usually done in such cases? It needs to work with at least all the major search engines.
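
On the robots.txt doubt raised above: a Disallow rule stops a compliant crawler from fetching the page at all, so it never sees the links on it. A rough way to check this is Python's standard urllib.robotparser. The rules, the example.com host and the /articles/ path below are made up for illustration and are not taken from the actual site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that blocks the listing pages from the question.
ROBOTS_TXT = """\
User-agent: *
Disallow: /index.html
Disallow: /page
"""


def can_crawl(path: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if a generic crawler may fetch the given path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # example.com is a placeholder host, not the asker's site.
    return parser.can_fetch("*", "https://example.com" + path)


if __name__ == "__main__":
    # /articles/some-article.html is an assumed article URL.
    for path in ("/index.html", "/page1.html", "/articles/some-article.html"):
        print(path, can_crawl(path))
```

With these rules the listing pages cannot be fetched, which is the opposite of "crawl but do not show", so the suspicion that robots.txt does not fit seems right.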


3 answers


KEKSOV, 2013-01-25
@zencd

If nothing besides the links on index.html leads to the pages with useful content, create a sitemap, list those links in it, and close index.html to indexing.
If you only need to keep part of a page out of the index, use the <noindex></noindex> tag.
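
A minimal sketch of the sitemap part, assuming the article URLs are already available as a list. The example.com URLs and the build_sitemap helper are placeholders for illustration. The output follows the sitemaps.org format with one url/loc entry per article.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return sitemap XML (as a string) listing the given absolute URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        # ElementTree escapes characters such as & in the URL text.
        ET.SubElement(entry, "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    # Placeholder article URLs; a real site would list its own articles here.
    articles = [
        "https://example.com/articles/first.html",
        "https://example.com/articles/second.html",
    ]
    print(build_sitemap(articles))
```

The result would be saved as sitemap.xml in the site root, so the articles stay discoverable even when the listing pages are kept out of the index.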


WEBIVAN, 2013-01-25
@WEBIVAN

<meta name="robots" content="noindex,follow">
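
If the pages come from a template, the tag can be emitted only for the listing pages. A small sketch, assuming the listing URLs look exactly like the ones in the question (/index.html, /page1.html, /page2.html and so on). The robots_meta_for helper and the /articles/ path are hypothetical.

```python
import re

# Assumed pattern for the listing pages from the question:
# /index.html, /page1.html, /page2.html, ...
LISTING_PAGE = re.compile(r"^/(index|page\d+)\.html$")

NOINDEX_FOLLOW = '<meta name="robots" content="noindex,follow">'


def robots_meta_for(path: str) -> str:
    """Return the meta tag to place in <head> for this path, or '' for none."""
    return NOINDEX_FOLLOW if LISTING_PAGE.match(path) else ""


if __name__ == "__main__":
    for path in ("/index.html", "/page2.html", "/articles/some-article.html"):
        print(path, repr(robots_meta_for(path)))
```

Article pages get no tag and stay indexable, while the listing pages are still fetched and their links followed.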


charliez, 2013-01-25
@charliez

All major search engines support sitemaps (sitemap.xml).