Web development
Mors Superberg, 2017-12-14 15:11:46

How does a smart web crawler work?

Hello, everyone! I have a task: write a server-side web crawler in PHP or Python that surfs the web and collects links, simply every link it finds.
I got curious about how this is actually implemented. Naively downloading pages and pulling the links out with regular expressions will work only so-so, frankly, because a site can load all of its links via AJAX (into the page body). There are also sites with infinite loops that kill such software: when you open the site, a working link is generated automatically, and it leads to a page with another dynamically generated link, and so on ad infinitum. Can you recommend a ready-made solution or explain how best to do all this? Thanks!


1 answer(s)
Fixid, 2017-12-14
@NooooN

Use Selenium, and then write your own logic on top of it.
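
To make that a bit more concrete, here is a rough sketch of how the two parts could fit together. It is not a ready-made solution. It assumes Selenium 4 with headless Chrome for rendering, and the names `make_driver`, `rendered_links`, `crawl`, `max_depth` and `max_pages_per_host` are made up for this example. The depth limit and the per-host page limit are one simple way to survive the "infinite link" sites from the question; the exact numbers are arbitrary.

```python
from collections import deque
from urllib.parse import urldefrag, urlsplit


def make_driver():
    """Start headless Chrome (assumes Selenium 4 and Chrome are installed)."""
    from selenium import webdriver  # lazy import: crawl() itself does not need it

    options = webdriver.ChromeOptions()
    options.add_argument("--headless=new")
    return webdriver.Chrome(options=options)


def rendered_links(driver, url):
    """Open url in the browser and return the hrefs found in the live DOM."""
    driver.get(url)
    # document.links holds <a> and <area> elements that have an href;
    # the browser has already resolved them to absolute URLs.
    return driver.execute_script(
        "return Array.from(document.links).map(function (a) { return a.href; });"
    )


def crawl(start_url, get_links, max_depth=3, max_pages_per_host=50):
    """Breadth-first crawl. get_links(url) must return an iterable of URLs.

    Returns the set of all links seen. The two limits (hypothetical defaults)
    bound the work even if a site generates new links forever.
    """
    start_url = urldefrag(start_url).url
    seen = {start_url}
    pages_per_host = {}
    queue = deque([(start_url, 0)])
    while queue:
        url, depth = queue.popleft()
        host = urlsplit(url).netloc
        if pages_per_host.get(host, 0) >= max_pages_per_host:
            continue  # this host has used up its page budget
        pages_per_host[host] = pages_per_host.get(host, 0) + 1
        try:
            links = get_links(url)
        except Exception:
            continue  # a page that fails to load should not stop the crawl
        for link in links:
            link = urldefrag(link).url  # "#fragment" variants are the same page
            if not link.startswith(("http://", "https://")):
                continue  # skip mailto:, javascript:, and similar
            if link in seen:
                continue
            seen.add(link)
            if depth + 1 <= max_depth:
                queue.append((link, depth + 1))
    return seen
```

Usage would look roughly like `driver = make_driver()`, then `crawl("https://example.com/", lambda u: rendered_links(driver, u))`, then `driver.quit()`. Note that `driver.get` returns once the page has loaded, so links that arrive later via AJAX may need an explicit wait before they are read; that part is left out here.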
