Parsing
mafaka111, 2021-09-03 17:22:21

What can be improved to send 40000 requests per second?

Good afternoon.
The task is to parse a large amount of data by querying the site's API.
On average, 5-10 simultaneous requests need to be made for each of 4000 entities.

Currently a Node.js script handles this: it spawns 100 child processes, each of which executes about 4000 asynchronous requests and waits for them with Promise.all.

The whole script takes about 15 seconds on a 6-core CPU; the server's response time is ~1 s.
Is it possible to speed this up on the current hardware, or will only horizontal scaling help? What kind of hardware would it take to run this many requests every 2-5 seconds?

Simplified code:

// main.js
const childProcess = require('child_process');
const entitiesChunks = splitToChunks(entities, 100); // splits the entity array into 100 parts
for (let i = 0; i < 100; i++) {
    let workerProcess = childProcess.fork('entityWorker.js');
    workerProcess.send(entitiesChunks[i]);
}

// entityWorker.js
process.on('message', async (entities) => {
    // Iterates over all elements; parseEntity() performs 5-10 async requests per entity
    process.send(await Promise.all(entities.map((el) => parseEntity(el))));
    process.exit();
});
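The helper splitToChunks is not shown in the question. Assuming it simply splits the entity array into roughly equal parts (the name and signature are taken from the snippet above; the implementation is a guess), it might look like:

```javascript
// Hypothetical implementation of the question's splitToChunks helper:
// split `arr` into `n` chunks of roughly equal size.
function splitToChunks(arr, n) {
  const size = Math.ceil(arr.length / n);
  const chunks = [];
  for (let i = 0; i < arr.length; i += size) {
    chunks.push(arr.slice(i, i + size));
  }
  return chunks;
}
```

With 4000 entities and n = 100 this yields 100 chunks of 40 entities each; note that if `arr.length < n` it produces fewer than `n` chunks, so the `for` loop in main.js would need to iterate over `chunks.length` rather than a hard-coded 100.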


2 answers
Alexander Aksentiev, 2021-09-03
@mafaka111

Currently a Node.js script handles this: it spawns 100 child processes, each of which executes about 4000 asynchronous requests and waits for them with Promise.all.

What is the question, if everything is already written and working?
What exactly is required to send 40,000 requests per second?

You need enough cores/threads, enough network bandwidth, and to avoid getting banned.
How much of each depends on how heavy each request is and how long it takes to run and to process the response.
Creating 100 child processes does not by itself increase speed: only as many run truly in parallel as the CPU can handle, i.e. as a rule 1 process = 1 CPU core/thread.
Even if one core juggles several processes at once, you then run into the network channel, and after that the processing of the received data (writing to a database / disk / RAM); all of that takes time, and a single server is unlikely to manage 40k operations per second.
Somewhere around here comes the realization that you need not one server but a whole cluster.
In general, you have to measure: check how many requests run on how many resources in order to understand how far those resources need to be scaled.
There is no formula for questions like "which hosting plan should I choose if my site has 10/100/1000 visitors".
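On the point about processes: since this workload is I/O-bound, a single Node.js process can keep thousands of requests in flight on its event loop; what matters is capping concurrency, not forking. A generic sketch of such a pool (runPool is an invented name; parseEntity stands in for the question's async function):

```javascript
// Run `worker(item)` over all items with at most `concurrency`
// calls in flight at once, in a single process.
async function runPool(items, worker, concurrency) {
  const results = new Array(items.length);
  let next = 0; // index of the next item to pick up

  // Each "lane" repeatedly takes the next unclaimed item.
  // Safe without locks: JS is single-threaded, and there is no
  // await between reading and incrementing `next`.
  async function lane() {
    while (next < items.length) {
      const i = next++;
      results[i] = await worker(items[i]);
    }
  }

  await Promise.all(Array.from({ length: concurrency }, lane));
  return results;
}
```

Something like `await runPool(entities, parseEntity, 500)` would keep ~500 requests in flight regardless of the number of cores; the right concurrency level has to be measured, as noted above. Child processes only start paying off when the CPU-bound part (parsing the responses) becomes the bottleneck.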

Saboteur, 2021-09-03
@saboteur_kiev

Honestly, the right way to do this is to extend the API to support bulk requests...
Or is that option not being considered at all?
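If such a bulk endpoint existed, 4000 entities could be fetched in a few dozen requests instead of tens of thousands. A sketch of the idea; the endpoint URL, query format, and response shape below are entirely hypothetical, and `fetch` assumes Node.js 18+:

```javascript
// Split IDs into batches for a hypothetical bulk endpoint.
function toBatches(ids, batchSize) {
  const batches = [];
  for (let i = 0; i < ids.length; i += batchSize) {
    batches.push(ids.slice(i, i + batchSize));
  }
  return batches;
}

// Fetch all entities, 100 IDs per request.
// The URL and JSON shape are invented for illustration.
async function fetchBulk(ids, batchSize = 100) {
  const responses = await Promise.all(
    toBatches(ids, batchSize).map((batch) =>
      fetch(`https://api.example.com/entities?ids=${batch.join(',')}`)
        .then((r) => r.json())
    )
  );
  return responses.flat();
}
```

With 4000 IDs and batches of 100 this is 40 HTTP requests in total, which changes the scaling question entirely.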
