Parsing
Kotaro Hiba, 2020-09-18 15:59:59

How can I do a task in multiple threads?

Hello. I am currently learning web scraping and need to process a large number of pages, around 4k. The pages themselves are not difficult to parse; I do it with nightmare.js.
But there is one problem: even at about 2-3 s per page, that is ~150 minutes, which is a lot.
So the question is how to properly run the work in multiple threads.
I have an array with 4k links to pages.
How do I split this array into, say, 4 parts and run the script for all of the parts simultaneously?

I have something like this in mind:
Split the array into 4 parts, make the function asynchronous, and call it for each part without waiting for a response, because the script simply adds data to the database after parsing each page.
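
The idea above could be sketched roughly as follows. This is only a minimal sketch, not a tested scraper: `processPage` is a hypothetical stand-in for the real per-page work (loading the page, parsing it, writing to the database), and `splitIntoChunks`, `processChunk` and `processAll` are names made up for illustration. Note that this gives concurrency inside one Node.js process rather than real threads, which is usually enough when most of the time is spent waiting on the network.

```javascript
// Hypothetical stand-in for the real per-page work (load, parse, save to DB).
// Replace the body with the actual scraping code.
async function processPage(url) {
  await new Promise((resolve) => setTimeout(resolve, 10));
  return `done: ${url}`;
}

// Split an array into at most `parts` contiguous chunks of near-equal size.
function splitIntoChunks(items, parts) {
  const size = Math.max(1, Math.ceil(items.length / parts));
  const chunks = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}

// Process one chunk sequentially, so each "worker" handles one page at a time.
async function processChunk(urls, handler) {
  const results = [];
  for (const url of urls) {
    try {
      results.push(await handler(url));
    } catch (err) {
      // One failed page should not stop the rest of the chunk.
      results.push({ url, error: String(err) });
    }
  }
  return results;
}

// Start all chunks at once and wait until every chunk has finished.
async function processAll(urls, parts, handler = processPage) {
  const chunks = splitIntoChunks(urls, parts);
  const perChunk = await Promise.all(
    chunks.map((chunk) => processChunk(chunk, handler))
  );
  return perChunk.flat();
}
```

Usage would look something like `processAll(links, 4).then(() => console.log("finished"))`. Waiting on `Promise.all` is optional if the results are only written to the database, but it makes it possible to know when all pages are done and to see which ones failed.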

I don't know how correct this approach is, so I would like to learn how to implement it from those who understand the matter better.


1 answer(s)
Vitaly, 2020-09-18
@vshvydky

pptp.dev
Nightmare should not be used anymore; it was relevant 3 years ago.
