gsdgdfgf, 2021-08-26 22:08:47
robots.txt

How do I write a correct robots.txt for a news site?

Hello! I'm launching a news site and writing its robots.txt. What is the right way to set it up for a news site? So far I have this:

User-agent: *
Allow: /wp-admin/admin-ajax.php
Allow: /*/uploads/
Disallow: /wp-admin/
Disallow: /cgi-bin/
Disallow: /?/
Disallow: /wp-/
Disallow: /wp/
Disallow: /*?s=/
Disallow: /*&s=/
Disallow: /search/
Disallow: /author/
Disallow: /users/
Disallow: /xmlrpc.php
Disallow: /*/trackback/
Disallow: /*/embed/
Disallow: /*utm*=/
Disallow: /*openstat=/

Sitemap: https://mysite.com/sitemap.xml
Sitemap: https://mysite.com/sitemap.rss


What should I add? What should I remove?

2 answers
Vladik Bubin, 2021-08-26
@ikoit

https://wp-kama.ru/id_803/pishem-pravilnyiy-robots...

Artem Zolin, 2021-08-27
@artzolin

I would remove everything from this list :) Seriously: just make sure your content has no links to search queries. /cgi-bin/ has not been used for ages. A robot will not go into /wp-admin/, and the login page is already closed with a noindex tag. /xmlrpc.php is not reachable from the front end anyway. You probably don't have author pages, and if you do, why close them? As for Openstat and UTM tags, let them be indexed, what's the harm?
And if you are going to write rules for robots, there is the robots_txt hook for that. It works like this:

// Add rules to the robots.txt file
add_filter( 'robots_txt', 'custom_robots_txt', 20, 2 );
function custom_robots_txt( $output, $public ) {

  $output .= "Disallow: /search/\n";
  $output .= "Disallow: /author/\n";
  $output .= "Disallow: /users/\n";
  
  return apply_filters( 'custom_robots_txt', $output, $public );
}
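
As a rough sketch of what this filter yields: assuming a public site, a stock WordPress install where nothing else has changed the generated robots.txt, and no physical robots.txt file in the site root (WordPress only serves the generated one when no such file exists), the response for /robots.txt would look approximately like this. The first three lines are WordPress's defaults, and the last three come from the filter above.

```
User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php
Disallow: /search/
Disallow: /author/
Disallow: /users/
```

Other code hooked to robots_txt (for example, something that adds a Sitemap line) may insert its own lines, so it is worth opening /robots.txt in a browser to confirm the final result.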
