How to exclude list sections from indexing by search engines?

HTML

unclechu, 2015-01-17 02:14:41

A typical site usually has some list sections, the most common being news. There is a page with the list of news, paginated in one form or another (so there are already several pages), and there are many detail pages for individual news items. Pagination in its usual form shifts the content as new items are added: after a while, a given list page no longer contains the news that used to be there (those items have moved to another page). It makes no sense to index the news list itself, both semantically and because a user who arrives from search may no longer find the information that was there quite recently. It does make sense to index the detail pages of the news items. How do I explain this to search engines?

5 answer(s)

Nikita Tarasov, 2015-01-17
@unclechu

Set up automatic interlinking on the detail news pages, for example:
Previous news (a link to the item published before the current one) and Next news (a link to the item published after it). Search robots need to be able to navigate between the news pages.
It is also advisable to implement a dynamic XML sitemap (it should not contain the news pagination pages).
Block the news pagination pages with a rule in robots.txt; the main thing is not to disallow the detail news pages.
You can check that your robots.txt is correct here
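
For a quick local sanity check of such a rule, here is a minimal sketch using Python's standard `urllib.robotparser`. The URL layout is purely an assumption for illustration (`/news/page/N` for pagination, `/news/<slug>` for detail pages), and `is_allowed` is a hypothetical helper. The standard library parser only does simple prefix matching, so it may not behave exactly like a particular search engine's own checker.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: pagination lives under /news/page/,
# detail pages live directly under /news/ (an assumed URL layout).
ROBOTS_TXT = """\
User-agent: *
Disallow: /news/page/
"""


def is_allowed(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Pagination page: expected to be blocked by the prefix rule.
    print(is_allowed(ROBOTS_TXT, "https://example.com/news/page/2"))
    # Detail page: expected to stay crawlable.
    print(is_allowed(ROBOTS_TXT, "https://example.com/news/some-article"))
```

The point of the check is the second call: whatever rule you write for the list pages, the detail pages must still come back as allowed.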

Vladimir Abramov, 2015-01-17
@kivsiak

https://ru.wikipedia.org/wiki/%D0%A1%D1%82%D0%B0%D...
Then
https://help.yandex.ru/webmaster/controlling-robot...
https://support.google.com/webmasters/answer/60626...

Andrey Lipattsev, 2015-01-17
@HabrAndrey

Just remember that robots.txt controls access, that is, crawling, not indexing. A URL can be indexed even without being visited, although the search engine will then have no information about the page's content.
By all means use robots.txt, but to keep pages out of the index (both preventively and after the fact), for Google it is best to use the noindex meta tag. It is also worth reading up on the URL Parameters section in WMT.
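
As a rough illustration of the meta tag approach, here is a minimal sketch of a template helper that emits a robots meta tag depending on the page type. The function name `robots_meta_tag` and the `is_list_page` flag are hypothetical, and adding `follow` (so that links to detail pages are still followed) is my assumption, not something stated above. A crawler has to be able to fetch the page to see the tag, so this assumes the list pages are not blocked in robots.txt.

```python
def robots_meta_tag(is_list_page: bool) -> str:
    """Build the robots meta tag to place inside the page <head>.

    List (pagination) pages get "noindex, follow"; detail pages get
    "index, follow". The flag name and the "follow" part are assumptions.
    """
    content = "noindex, follow" if is_list_page else "index, follow"
    return f'<meta name="robots" content="{content}">'


if __name__ == "__main__":
    print(robots_meta_tag(is_list_page=True))   # tag for a news list page
    print(robots_meta_tag(is_list_page=False))  # tag for a detail page
```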

Глюкъ Виртуален, 2015-01-17
@gluck59

That is all nonsense and needless overthinking.
It is enough to wrap the unwanted content in a noindex tag.

icetomcat, 2015-06-17
@kostia256

On the other hand, you may want the lists to be crawled (Q&A, comments, etc.) while avoiding this whole mess with shifting pages. The logic is simple: if I show the list upside down (i.e. from new entries to old), then I should reverse the page numbers in the URL as well.
Admittedly, it does not look very elegant when a user clicks the first page in the pagination and gets ?page=42 in the URL, and conversely clicks the last page and gets ?page=1. But everything always stays in its place.
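
Here is a minimal sketch of that reversed numbering, assuming the items are stored oldest first and a fixed page size. All function names are hypothetical. URL page 1 always holds the oldest items, so a page that has filled up never changes; only the newest page (the highest number) keeps growing.

```python
import math


def total_pages(item_count: int, page_size: int) -> int:
    """Number of URL pages needed for item_count items (at least 1)."""
    return max(1, math.ceil(item_count / page_size))


def url_page_for_position(position: int, item_count: int, page_size: int) -> int:
    """Map a pager position (1 = newest items) to the ?page= number in the URL."""
    return total_pages(item_count, page_size) - position + 1


def items_for_url_page(items_oldest_first: list, url_page: int, page_size: int) -> list:
    """Return the items for a URL page, displayed newest first.

    URL page 1 covers the oldest page_size items, page 2 the next ones, etc.
    """
    start = (url_page - 1) * page_size
    chunk = items_oldest_first[start:start + page_size]
    return list(reversed(chunk))


if __name__ == "__main__":
    news = list(range(1, 421))  # 420 items, ids grow with time
    print(url_page_for_position(1, len(news), 10))   # first pager link
    print(url_page_for_position(42, len(news), 10))  # last pager link
    print(items_for_url_page(news, 1, 10))           # oldest page, stable
```

With 420 items and 10 per page this reproduces the situation described above: the first pager link points to ?page=42 and the last one to ?page=1. The trade-off is that the newest page can be only partially filled.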