Python
Sazoks, 2019-05-11 13:40:17

Can Google block requests due to their large number (Python Parser)?

Hello. I didn't know Python at all 2 weeks ago.
I urgently needed to write my own email parser. In that time I had to learn the basics of Python and the relevant libraries. The parser works as follows:
1) There is a file with a fairly large number of Russian cities
2) The program reads the file line by line (one city per line) and sends the search query 'website development in ' + city
3) Each site in the Google search results is parsed separately for email addresses, which are saved to a file
4) Then the same is done for the next city
But when testing the program, I noticed that it sometimes skips a city. That is, normally the console shows the name of the city and, below it, all the emails found for that query. But sometimes there is just the name of the city with nothing below it, followed by the next city. So my question is: could this be caused by blocking because of too many requests?
Thanks

2 answer(s)
Ivan Shumov, 2019-05-11
@Sazoks

Google not only can, but does. In your particular case, though, you should look at the response headers and the status code of each request. For parsing large volumes, proxies are generally used.
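To illustrate the point about headers and response codes, here is a minimal sketch that separates "the search really returned nothing" from "the request was probably blocked". The names `classify_response` and `retry_after_seconds`, the set of status codes, and the text markers are all assumptions for illustration, not documented Google behaviour, so adjust them to what you actually see in your responses.

```python
from typing import Mapping, Optional

# Assumption: rate limiting usually shows up as one of these status codes.
BLOCK_STATUS_CODES = {429, 503}

# Assumption: phrases that tend to appear on a captcha / block page.
BLOCK_MARKERS = ("captcha", "unusual traffic")


def classify_response(status_code: int, body: str) -> str:
    """Return 'blocked', 'error' or 'ok' for a single search response."""
    if status_code in BLOCK_STATUS_CODES:
        return "blocked"
    lowered = body.lower()
    if any(marker in lowered for marker in BLOCK_MARKERS):
        return "blocked"
    if status_code != 200:
        return "error"
    return "ok"


def retry_after_seconds(headers: Mapping[str, str]) -> Optional[int]:
    """Read Retry-After when it is given as a number of seconds.

    The header may also contain an HTTP date; that form is ignored here.
    """
    value = headers.get("Retry-After")
    if value is not None and value.strip().isdigit():
        return int(value.strip())
    return None
```

With the requests library this could be called as `classify_response(resp.status_code, resp.text)` and `retry_after_seconds(resp.headers)`. Printing the result next to each city name would show whether the empty cities coincide with blocked responses.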

SagePtr, 2019-05-11
@SagePtr

Yes, with a large number of requests from one IP, Google starts showing a captcha on searches. You can see this for yourself in the Opera browser with its VPN turned on: because so many users go through the same VPN, the captcha on Google is almost permanent.
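Since the captcha is triggered by the number of requests from one IP, one common way to reduce the risk is to pause between requests and to wait longer after each suspected block. Below is a rough sketch of exponential backoff with random jitter. The function name `backoff_delays` and all default values are assumptions chosen for illustration, and nothing here guarantees that Google will not show a captcha anyway.

```python
import random
from typing import List, Optional


def backoff_delays(
    base: float = 5.0,
    factor: float = 2.0,
    max_delay: float = 300.0,
    attempts: int = 5,
    rng: Optional[random.Random] = None,
) -> List[float]:
    """Return a list of waiting times (in seconds) for successive retries.

    The upper bound grows as base * factor ** attempt, capped at max_delay.
    Each delay is drawn at random between half the bound and the bound,
    so that requests do not arrive at perfectly regular intervals.
    """
    rng = rng or random.Random()
    delays = []
    for attempt in range(attempts):
        ceiling = min(max_delay, base * factor ** attempt)
        delays.append(rng.uniform(ceiling / 2, ceiling))
    return delays
```

In the parser, each value would be passed to `time.sleep()` before retrying the same city, instead of moving on to the next one with an empty result.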
