Laravel
Sergey Pashkevich, 2021-02-01 11:42:35

How to prevent concurrent duplicate incoming requests on the API side?

Hello, colleagues.
I ran into a problem when synchronizing data between different sites.

There is a site A from which receipt data is collected and sent to site B in batches of 1000 records, sequentially. On site A the submit button is blocked against repeated clicks, and so on, but somehow duplicate requests still arrived at site B at the same time.

On site B we receive incoming requests with batches of 1000 receipts, check each receipt individually against the database to make sure it does not already exist, save the batch, and then wait for the next one. Processing one batch takes about 30 seconds. Everything worked fine for a year, but at some point two requests arrived simultaneously and the data was duplicated: the first request had not yet finished writing its data to the database, so the second request passed the duplicate check.
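Simplified, our duplicate check looks roughly like this (the Receipt model and field names below are illustrative, not the real code):

// Simplified controller action on site B; Receipt model and
// 'external_id' field are illustrative names.
public function store(Request $request)
{
    foreach ($request->input('receipts') as $item) {
        // Both concurrent requests can pass this check before
        // either of them has inserted anything...
        if (! Receipt::where('external_id', $item['id'])->exists()) {
            // ...so both end up inserting the same receipt.
            Receipt::create([
                'external_id' => $item['id'],
                'amount'      => $item['amount'],
            ]);
        }
    }

    return response()->json(['status' => 'ok']);
}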

How can such problems be solved on the API side?


2 answers
Sergey Sokolov, 2021-02-01
@siarheipashkevich

Laravel is beautiful and well thought out :) It also ships with a ready-made foundation for organizing a task queue.
Essentially, your task is to turn possibly parallel requests into serial ones.
A task queue with a single worker is a good fit for this.
Create a new Job via artisan and move the request-processing logic into it. When an API request comes in, just dispatch a new job and immediately return an instant "OK, accepted" response.
The worker runs continuously: it is either processing a long-running batch or waiting for a new job to arrive. It will definitely not take two in parallel.
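A minimal sketch of that (the job class name and fields are my assumptions, not your code):

// app/Jobs/ProcessReceiptBatch.php — minimal sketch, names are illustrative.
namespace App\Jobs;

use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Foundation\Bus\Dispatchable;
use Illuminate\Queue\InteractsWithQueue;
use Illuminate\Queue\SerializesModels;

class ProcessReceiptBatch implements ShouldQueue
{
    use Dispatchable, InteractsWithQueue, Queueable, SerializesModels;

    public function __construct(public array $receipts) {}

    public function handle(): void
    {
        // Move the existing "check for duplicates, then insert" logic here.
        // With a single worker, batches are processed strictly one at a time.
    }
}

// In the controller: enqueue the batch and answer immediately.
// ProcessReceiptBatch::dispatch($request->input('receipts'));
// return response()->json(['status' => 'accepted'], 202);

// Run exactly one worker so jobs are serialized:
// php artisan queue:work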
P.S. Why does processing 1000 receipts take so long? Is the database slow on inserts? Are the indexes wrong?

Quadrollionaire, 2021-02-01
@Quadrollionaire

I don't think this is the best solution, but as an option:
1) Compute a hash (from the receipt IDs or something else; the main thing is that the request can be uniquely identified)
2) Push it all into Redis (Rabbit or Kafka would also work with the right config)
3) Create handlers that read the queue and do all the work
P.S. You can do without a hash by simply recording processed receipts in Redis for, say, the last half day. When a request arrives, check which receipts have already been processed, don't add those to the queue, and let the new ones go on their way. In that case you don't need to compute a unique key at all (less hassle). See the sketch below.
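A rough sketch of that variant with Laravel's Redis facade (the key prefix and TTL are just examples):

use Illuminate\Support\Facades\Redis;

$fresh = [];

foreach ($request->input('receipts') as $item) {
    $key = 'receipt:seen:' . $item['id'];

    // Atomic SET ... NX EX: only the first request "claims" this ID.
    // Returns truthy if the key was set, falsy if it already existed.
    if (Redis::set($key, 1, 'EX', 43200, 'NX')) { // 43200 s = half a day
        $fresh[] = $item; // new receipt — pass it on to the queue
    }
}

// Only $fresh items get enqueued; repeats are silently dropped.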
