Dmitry Matveev, 2017-01-16 23:39:19
Python

Site scraping: how do I correctly implement a large number of successful requests?

Hello, I need to scrape a site. The site has many links, and each link leads to a page with the full details. The data on the site changes every second, but an update interval of 7 seconds is enough for me.
So if the site has 100 links, I need to visit every one of them to collect all the information, and repeat that every 7 seconds. That makes the number of requests very large, and after a while (30-40 minutes) the server blocks me. I understand that I'm doing a bad thing, but I want to finish the job :)
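
For reference, this is roughly the polling loop being described, a minimal sketch that fetches all links concurrently on a 7-second cycle using aiohttp; the URLs and the parsing step are placeholders, not the real site:

import asyncio
import aiohttp

LINKS = [f"https://example.com/item/{i}" for i in range(100)]  # hypothetical URLs
INTERVAL = 7  # seconds between full passes over all links

async def fetch(session, url):
    async with session.get(url) as resp:
        return await resp.text()

async def poll_once(session):
    # Fetch all pages concurrently instead of one after another.
    pages = await asyncio.gather(
        *(fetch(session, url) for url in LINKS), return_exceptions=True
    )
    for url, page in zip(LINKS, pages):
        if isinstance(page, Exception):
            print(f"failed: {url}: {page}")
        # else: parse `page` here

async def main():
    async with aiohttp.ClientSession() as session:
        while True:
            started = asyncio.get_running_loop().time()
            await poll_once(session)
            # Sleep only for the remainder of the 7-second window.
            elapsed = asyncio.get_running_loop().time() - started
            await asyncio.sleep(max(0, INTERVAL - elapsed))

asyncio.run(main())

This speeds up each pass but does not by itself avoid the block; the server still sees the same request volume from one address.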
I see several ways to solve this problem:
1. Find a way to download all the information in a single request (an API, a combined page). I did not find anything like that on this resource.
2. Use proxies. The problem here is that they work very slowly, and in theory more than 10 of them are needed for this volume. One idea is to rent servers and use them as proxies.
Proxies seem like the most practical option to me, but I have not managed to implement polling the site at a fixed interval. Could you please help me? Maybe there are other ways to solve this problem; and if proxies are the only option, how can I best implement this algorithm, and what should I read on the topic? Thank you! :) I use Python 3.
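
A hedged sketch of the proxy-rotation idea from option 2, using the requests library; the proxy addresses are placeholders for whatever servers you rent, and the timeout handles the "they work very slowly" problem by skipping dead or slow proxies:

import itertools
import requests

PROXIES = [  # hypothetical proxy addresses
    "http://198.51.100.1:3128",
    "http://198.51.100.2:3128",
    "http://198.51.100.3:3128",
]
proxy_cycle = itertools.cycle(PROXIES)

def fetch_via_proxy(url, timeout=5):
    """Try each proxy in turn until one answers within the timeout."""
    for _ in range(len(PROXIES)):
        proxy = next(proxy_cycle)
        try:
            resp = requests.get(
                url, proxies={"http": proxy, "https": proxy}, timeout=timeout
            )
            resp.raise_for_status()
            return resp.text
        except requests.RequestException:
            continue  # slow or dead proxy: move on to the next one
    raise RuntimeError(f"all proxies failed for {url}")

Because successive calls rotate through the pool, each individual proxy sees only a fraction of the total request rate, which is the whole point of using several of them.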
