How to check the existence of a url in multiple threads?
Here is the simple code I use to check whether a page exists:
from bs4 import BeautifulSoup

# `url` (the base URL) and `get_html` (fetches the page body) are defined elsewhere.

def check_url():
    for page in range(0, 239999):
        soup = BeautifulSoup(get_html(url + str(page)), 'html.parser')
        if soup.find('h3', class_='description_404_A hide'):
            print('Page not exists: {}'.format(url + str(page)))
        else:
            print('Page found: {}'.format(url + str(page)))
            with open('pages.txt', 'a') as file:
                file.write(url + str(page) + '\n')
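A minimal sketch of the multithreaded version using the standard library's `ThreadPoolExecutor`. It keeps the original check (look for the site's 404 marker `<h3 class="description_404_A hide">`) but swaps the asker's `get_html` helper for `requests.get`, and the base `url` here is a placeholder, so both are assumptions:

```python
from concurrent.futures import ThreadPoolExecutor

# Assumption: placeholder for the question's base URL.
url = 'http://example.com/page'

def page_exists(page):
    # Same check as the original loop body: fetch, parse, look for the
    # site's 404 marker. requests/bs4 are imported lazily so the
    # threading helper below does not depend on them.
    import requests
    from bs4 import BeautifulSoup
    html = requests.get(url + str(page), timeout=10).text
    soup = BeautifulSoup(html, 'html.parser')
    return soup.find('h3', class_='description_404_A hide') is None

def check_urls(pages, checker=page_exists, workers=20):
    # pool.map runs `checker` concurrently but returns results
    # in the same order as the input pages.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(checker, pages))
```

Usage would be `check_urls(range(0, 239999))`; the `checker` parameter is injectable, which also makes the concurrency logic easy to test without hitting the network.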
It has been said many times: horizontal scaling of stateless handlers is done with queues (RabbitMQ, for example).
And if you don't want the hassle, take AWS SQS + AWS Lambda and get all of this processed in .... I think it would finish in a couple of minutes; even the free tier could handle it.
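RabbitMQ or SQS need an external broker, but the producer/worker pattern the answer describes can be sketched in-process with the standard library's `queue.Queue` (the `worker` and `process_pages` names are illustrative, not from any library):

```python
import queue
import threading

def worker(tasks, results):
    # Each worker pulls page numbers until the queue is drained.
    while True:
        try:
            page = tasks.get_nowait()
        except queue.Empty:
            return
        # A real worker would fetch and parse the page here;
        # this sketch just records the page number as "processed".
        results.append(page)
        tasks.task_done()

def process_pages(pages, n_workers=4):
    # Producer: enqueue all work up front, then start the workers.
    tasks = queue.Queue()
    for p in pages:
        tasks.put(p)
    results = []
    threads = [threading.Thread(target=worker, args=(tasks, results))
               for _ in range(n_workers)]
    for t in threads:
        t.start()
    tasks.join()   # block until every task_done() has been called
    for t in threads:
        t.join()
    return sorted(results)  # completion order is nondeterministic
```

With a real broker the queue outlives the process, so workers can run on separate machines and crashed tasks can be re-delivered, which is what makes this pattern scale horizontally.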