How does a smart web crawler work?
Hello everyone! I've been given the task of writing a server-side web crawler in PHP or Python that will traverse the web and collect links, simply every link it can find.
I'm curious how this is actually implemented. If we naively download pages and extract links from the HTML, the results will be mediocre, frankly, because a site can load its links via AJAX (into the page body). There are also sites with infinite loops that kill naive software: when you visit the site, a working link is dynamically generated that leads to a page with another dynamically generated link, and so on ad infinitum. Can you recommend a ready-made solution, or explain the best way to build this? Thanks :)
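To illustrate the two problems mentioned above, here is a minimal sketch of a crawler in plain Python (standard library only; the URL limits, timeouts, and function names are my own placeholder assumptions, not a ready-made product). A `visited` set plus a depth cap and a page cap are what guard against the infinite dynamic-link traps described in the question. Note that this sketch only sees links present in the static HTML; links injected via AJAX would require driving a real browser engine, e.g. with Selenium or Playwright.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse
from urllib.request import urlopen


class LinkExtractor(HTMLParser):
    """Collects the href attribute of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def extract_links(html, base_url):
    """Return absolute http(s) URLs found in the static HTML of a page."""
    parser = LinkExtractor()
    parser.feed(html)
    urls = set()
    for href in parser.hrefs:
        absolute = urljoin(base_url, href)  # resolve relative links
        if urlparse(absolute).scheme in ("http", "https"):
            urls.add(absolute.split("#")[0])  # drop in-page fragments
    return urls


def crawl(seed_url, max_pages=100, max_depth=3):
    """Breadth-first crawl; the visited set and depth cap break
    the 'infinitely generated link' traps mentioned above."""
    visited = set()
    queue = deque([(seed_url, 0)])
    while queue and len(visited) < max_pages:
        url, depth = queue.popleft()
        if url in visited or depth > max_depth:
            continue
        visited.add(url)
        try:
            html = urlopen(url, timeout=5).read().decode("utf-8", "replace")
        except Exception:
            continue  # skip unreachable or non-decodable pages
        for link in extract_links(html, url):
            if link not in visited:
                queue.append((link, depth + 1))
    return visited
```

A production crawler would add politeness (robots.txt, per-host rate limits) and URL canonicalization on top of this, but the visited-set/depth-limit idea stays the same.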