How do search engines index unlinked sections / files of a site?
Actually, here is the question. I am not yet well versed in how search engines work, but I want to learn more. Right now I'm interested in the following: how are site resources indexed when they are not connected to anything else, i.e. no other page links to them, they are not mentioned anywhere on the site, etc.? How does the search engine know where a given file is located if no links lead to it?
As I understand it, the search engine follows links, enters the directories available to it, and indexes all the files in them, even those that nothing links to. Or does it make some kind of connection to the server and try to see which directories and files exist at all?
Is that correct?
A search engine can reach a page:
- via a link to that page
- if the page is listed in the sitemap
- if the site exports data to the search engine in some other way (reviews, products in Yandex)
- if the page carries a counter from the search engine (Analytics, Metrica)
It will not blindly brute-force URLs to find a page.
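To make the sitemap route concrete, here is a minimal sketch of how a crawler could learn about a file that no page links to. It assumes the site publishes a sitemap in the sitemaps.org format; the function name and the example.com URLs are made up for illustration, and real crawlers are certainly more involved.

```python
import xml.etree.ElementTree as ET

# XML namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def urls_from_sitemap(xml_text: str) -> list[str]:
    """Return every <loc> URL listed in a sitemap document."""
    root = ET.fromstring(xml_text)
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]


# Hypothetical sitemap: the second entry is a file no page links to,
# yet the crawler still learns its address from the sitemap.
EXAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/hidden/report.pdf</loc></url>
</urlset>"""

if __name__ == "__main__":
    for url in urls_from_sitemap(EXAMPLE_SITEMAP):
        print(url)
```

The point is only that the URL reaches the crawler as data the site handed over, not through guessing.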
Hmm, so there have to be pointers to all of this? That is, if you put a file on the server that is in no way connected to the rest of the content, it will not be indexed?
The reason I ask: the other day I searched for a passwd.dat file containing login / password pairs for authorization forms, and many search engines returned plenty of results, with a bunch of sites where this file was found. So the question arose: how was it found? I don't think the sites link to it. The only thought that comes to mind is that the path to it may appear in some other script on the site. But that seems unlikely to me.
All sorts of Google and Yandex toolbars can also “phone home” when you yourself open a resource (one that nothing links to) by its full path.
1) Isn't there a menu with links to the sections?
2) Why have a site that lacks proper internal linking between sections?
The question is not about any specific site but about sites in general that hold files like passwd.dat, which the site itself naturally does not link to. I'm more inclined to believe that the search engine lists the directory available to it. I don't know how to explain it, something like the “dir” console command in Windows, which displays the contents of a directory. Is that possible?
I don't know for sure, but I suppose that if the site has a link like /dir1/dir2/dir3/, search engines may check /dir1/dir2/dir3/ as well as /dir1/dir2/ and /dir1/. So if, for example, Apache is installed on the server with the mod_autoindex module enabled and not configured properly, a list of the files in those directories will be displayed.
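A rough sketch of that guess, for illustration only: given one known URL, derive its parent directories as further candidates to request, and apply a simple heuristic for spotting an auto-generated listing. Both function names are hypothetical. The "Index of /" title check relies on the default page that Apache's mod_autoindex produces and is an assumption about how a crawler might recognize it, not a documented search engine behavior.

```python
from urllib.parse import urlsplit, urlunsplit


def parent_directories(url: str) -> list[str]:
    """List the directory of `url` and every parent directory up to the site root."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    # A path that does not end in "/" ends with a file name; drop it.
    if segments and not parts.path.endswith("/"):
        segments.pop()
    candidates = []
    for depth in range(len(segments), -1, -1):
        path = "/" + "/".join(segments[:depth])
        if not path.endswith("/"):
            path += "/"
        candidates.append(urlunsplit((parts.scheme, parts.netloc, path, "", "")))
    return candidates


def looks_like_directory_listing(html: str) -> bool:
    """Heuristic: default Apache autoindex pages are titled 'Index of /...'."""
    return "<title>index of /" in html.lower()


if __name__ == "__main__":
    for candidate in parent_directories("https://example.com/dir1/dir2/dir3/"):
        print(candidate)
```

If any of those directories answers with a listing page, every file in it, passwd.dat included, becomes an ordinary link the crawler can follow.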