Django
maksam07, 2021-04-17 16:20:52

How to make infinite multi-threaded data parsing?

Good afternoon! I am writing a site in Django that provides analytics for dozens of other sites, and I have run into the following tasks:
1. How do I parse data from a list of URLs in multiple threads? Suppose there are 100 URLs. I have read about multiprocessing Pool (+requests, +BeautifulSoup) and have already implemented one version with it, but I would like to hear from experts how to do this correctly.
2. When the parsing from task 1 finishes, it should start again immediately, and keep repeating for as long as the site / server is running. I have only worked with cron, and it will not work here.
2.2. If a run finishes too quickly, for example within 2 seconds, the next run should be held back so that it starts no earlier than 10 seconds after the previous one started. In theory this can be handled on the server side, but perhaps the solution to task 2 has a built-in setting for such limits.
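
For reference, the flow described in points 1 to 2.2 could look roughly like the sketch below. It is not the multiprocessing Pool version mentioned above: it swaps in `concurrent.futures.ThreadPoolExecutor` and `urllib.request` from the standard library so that it runs without extra packages. The names `fetch`, `parse_all`, `run_forever`, `fetch_func`, `max_workers`, `min_interval` and `max_runs` are hypothetical and exist only for illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen


def fetch(url, timeout=10):
    """Download one page and return its body as bytes (hypothetical helper)."""
    with urlopen(url, timeout=timeout) as response:
        return response.read()


def parse_all(urls, fetch_func=fetch, max_workers=20):
    """Fetch every URL concurrently and return {url: body or exception}."""
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {url: pool.submit(fetch_func, url) for url in urls}
        for url, future in futures.items():
            try:
                results[url] = future.result()
            except Exception as exc:
                # One failing site should not abort the whole run.
                results[url] = exc
    return results


def run_forever(urls, min_interval=10.0, fetch_func=fetch, max_runs=None):
    """Repeat parse_all; each run starts at least min_interval seconds
    after the previous run started. max_runs=None means loop forever."""
    runs = 0
    while max_runs is None or runs < max_runs:
        started = time.monotonic()
        results = parse_all(urls, fetch_func=fetch_func)
        # Assumption: parsing the HTML and saving to the database goes here.
        runs += 1
        elapsed = time.monotonic() - started
        if elapsed < min_interval and (max_runs is None or runs < max_runs):
            time.sleep(min_interval - elapsed)
    return runs
```

Threads are used here on the assumption that most of the time is spent waiting on the network rather than on CPU work. A loop like this would have to run in its own long-lived process next to the Django site, not inside a request handler.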

2 answer(s)
Sergey Gornostaev, 2021-04-17
@maksam07

Celery

Vladimir Korotenko, 2021-04-17
@firedragon

Any queue will do. Just have the task itself add the task to the queue again.
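
A minimal sketch of that idea, assuming an in-process `queue.Queue` with a single worker thread stands in for a real task queue. The names `worker`, `make_repeating_task`, `job` and `min_interval` are hypothetical; a production setup would more likely use a dedicated queue system.

```python
import queue
import threading
import time


def worker(tasks, stop_event):
    """Take (run_at, func) pairs off the queue and run each one when due."""
    while not stop_event.is_set():
        try:
            run_at, func = tasks.get(timeout=0.1)
        except queue.Empty:
            continue
        delay = run_at - time.monotonic()
        if delay > 0:
            # Wait until the task is due, but wake up early on shutdown.
            stop_event.wait(delay)
        if not stop_event.is_set():
            func()
        tasks.task_done()


def make_repeating_task(tasks, job, min_interval=10.0):
    """Wrap job so that every run puts itself back on the queue."""
    def task():
        started = time.monotonic()
        try:
            job()
        finally:
            # Re-add the task; the next run may start no earlier than
            # min_interval seconds after this run started.
            tasks.put((started + min_interval, task))
    return task
```

To start the cycle, put the first task on the queue with `tasks.put((time.monotonic(), task))` and run `worker` in a thread. Because the task is re-added in a `finally` block, the cycle keeps going even if one run raises inside `job`, although the exception itself still propagates to the worker in this sketch.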
