2015-04-09 20:09:49

AJAX crawling: is it possible to tell Googlebot that some parameters in a hash URL should be ignored when indexing?

So, under the AJAX crawling scheme, to get an HTML snapshot of an application page, everything after #! is passed to the server-side mirror script in the _escaped_fragment_ GET parameter.
Everything seems simple, or maybe I just haven't understood anything yet =) But what if the hash URL contains a parameter that should not be indexed, i.e. Googlebot should not treat it as a separate page or pass it in _escaped_fragment_?
For example, in an application with URLs like the ones below, the [&overlay=xxxx] parameter should be ignored when indexing, i.e. the robot should treat these URLs as equivalent and as representing a single page:

  • site.com/#!users/top/?q=lamak
  • site.com/#!users/top/?q=lamak&overlay=ololo
  • site.com/#!users/top/?q=lamak&overlay=doesnt_matter_what
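As far as I understand the scheme, the crawler rewrites each of these URLs roughly as in the sketch below. The function name and the http:// prefix are my own assumptions for illustration, and the set of escaped characters is only approximate.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def to_escaped_fragment_url(url: str) -> str:
    """Approximate the #! -> _escaped_fragment_ rewrite done by the crawler."""
    parts = urlsplit(url)
    if not parts.fragment.startswith("!"):
        return url  # not a hash-bang URL, nothing to rewrite
    # Characters such as & # % + are percent-escaped; the safe set is an approximation.
    escaped = quote(parts.fragment[1:], safe="/?=:@!$'()*,;")
    token = "_escaped_fragment_=" + escaped
    query = parts.query + "&" + token if parts.query else token
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, ""))


urls = [
    "http://site.com/#!users/top/?q=lamak",
    "http://site.com/#!users/top/?q=lamak&overlay=ololo",
    "http://site.com/#!users/top/?q=lamak&overlay=doesnt_matter_what",
]
for u in urls:
    print(to_escaped_fragment_url(u))
```

Each of the three URLs maps to a different _escaped_fragment_ request, which is exactly the problem.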

In other words, how do I separate the parameters that affect the content from service parameters that matter only on the client side and not for indexing, so that the robot does not request three pages for these URLs and only one gets into the index? Of course, I could exclude such pages from indexing manually, but is there a way to do this at the URL level?
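To make clear what I mean by "equivalent", here is a minimal sketch of the normalization I would like the robot to apply. It could run in the mirror script on the decoded _escaped_fragment_ value. CLIENT_ONLY_PARAMS and canonical_fragment are hypothetical names of mine, and I don't know whether Googlebot offers anything like this by itself.

```python
from urllib.parse import parse_qsl, urlencode

# Assumption: parameters that only drive client-side UI and never change content.
CLIENT_ONLY_PARAMS = {"overlay"}


def canonical_fragment(fragment: str) -> str:
    """Drop client-only parameters from the part of the URL after #!."""
    path, _, query = fragment.partition("?")
    kept = [
        (key, value)
        for key, value in parse_qsl(query, keep_blank_values=True)
        if key not in CLIENT_ONLY_PARAMS
    ]
    return path + "?" + urlencode(kept) if kept else path


fragments = [
    "users/top/?q=lamak",
    "users/top/?q=lamak&overlay=ololo",
    "users/top/?q=lamak&overlay=doesnt_matter_what",
]
# All three collapse to the same canonical fragment.
print({canonical_fragment(f) for f in fragments})
```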

What happens if I drop AJAX crawling altogether and simply serve HTML to Googlebot and JSON to the application? Won't Google ban the site for that? Then it's simple:
  • site.com/users/top/?q=lamak
  • site.com/users/top/?q=lamak#overlay=ololo
  • site.com/users/top/?q=lamak#overlay=doesnt_matter_what

with navigation on the client handled via the History API as usual.
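The reason this variant looks simple to me: a plain # fragment (without the !) is never sent to the server, so all three URLs end up as the same request. A small check of that, again with an assumed http:// prefix:

```python
from urllib.parse import urldefrag

urls = [
    "http://site.com/users/top/?q=lamak",
    "http://site.com/users/top/?q=lamak#overlay=ololo",
    "http://site.com/users/top/?q=lamak#overlay=doesnt_matter_what",
]
# urldefrag splits off the fragment; what remains is what the server is asked for.
requested = {urldefrag(u).url for u in urls}
print(requested)
```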
Thank you =)
