How to improve the parser?
Good afternoon!
Could you suggest how to modify the code so that, every other run, the soups list doesn't end up containing the error page I've pasted after the code?
import requests
from tqdm import tqdm

soups = []
list_names = ['Александр', 'Иван']
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.121 Safari/537.36'}

for name in tqdm(list_names):
    for number in tqdm(range(2, 4)):
        p = {'searchQuery': name,
             'page': number}
        r = requests.get('https://cs/agents', params=p, headers=headers,
                         cookies={'abc': 'all', 'count': '10'})
        soups.append(r.text)
<html>
<head><title>504 Gateway Time-out</title></head>
<body bgcolor="white">
<center><h1>504 Gateway Time-out</h1></center>
</body>
</html>
<!-- a padding to disable MSIE and Chrome friendly error page -->
That's the server itself complaining. I would look at the following options:
1. Put a delay between requests. Popular names probably return a lot of data, and the server times out under the load.
2. Add a basic status check: if the status code is not 200, re-send the request to the same page until it responds normally.
3. On top of the second point, cap the number of attempts, say 3 retries every 5-10 seconds. If it still hasn't worked after three attempts, move on to the next page (or the next name). The check itself is just: if r.status_code != 200:
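The three suggestions above can be sketched as a small retry helper. The names fetch_with_retries and do_request are illustrative (not from the original code); in the asker's script, do_request would wrap the requests.get call and return the status code and body:

```python
import time

def fetch_with_retries(do_request, retries=3, delay=5):
    """Call do_request() until it returns (status, body) with status 200.

    Waits `delay` seconds between attempts and gives up after `retries`
    tries, returning None so the caller can skip to the next page/name.
    """
    for attempt in range(retries):
        status, body = do_request()
        if status == 200:
            return body
        time.sleep(delay)  # back off before retrying the same page
    return None  # still failing after all retries: caller moves on
```

In the original loop this would replace the bare requests.get: append the returned body to soups when it is not None, and otherwise continue to the next page or name.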