go
Nikolay Progressov, 2019-04-26 13:40:18

Algorithm optimization advice?

Task: fetch, filter, and accurately store data from a large remote database via the provider's API. The provider returns the data page by page, 50 records per page, or fewer when the end is near. Once the data runs out, we receive the expected end-of-data markers.
At first I solved the problem in single-threaded mode, just page++; that run takes about 23 hours. Due to factors outside my control, the database may be updated once a day, which can break consistency mid-run, and frankly, 23 hours is simply too long.
Then I wrote a pool of 10 workers, and the job dropped from 23 hours to 1-3 hours. The pool is essentially a revolver: the master does page++ and a free worker picks up the task. Boundary found? The remaining workers either discover it mid-task and finish what they have, or shut down while waiting for the next task. Everything works fine. A rough sketch of the scheme follows below.
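
A minimal sketch of the revolver, assuming a hypothetical fetchPage client and a placeholder Record type (illustrative stand-ins, not the production code):

```go
package main

import (
	"log"
	"sync"
	"sync/atomic"
)

// Record stands in for whatever the API returns; fields omitted.
type Record struct{}

// fetchPage is a placeholder for the real API call. The bool reports
// whether this page was the last one (the end-of-data marker).
func fetchPage(page int) (recs []Record, last bool, err error) {
	// ... call the remote API: 50 records per page, fewer at the end ...
	return nil, page >= 100, nil // pretend the data ends around page 100
}

func main() {
	const workers = 10

	pages := make(chan int) // the "revolver": the master deals out page numbers
	var done atomic.Bool    // set once any worker sees the boundary
	var wg sync.WaitGroup

	// Master: hand out page numbers until a worker reports the boundary.
	go func() {
		defer close(pages)
		for page := 1; !done.Load(); page++ {
			pages <- page
		}
	}()

	// Workers: grab the next free page; fetch, filter, store.
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for page := range pages {
				recs, last, err := fetchPage(page)
				if err != nil {
					log.Printf("page %d: %v", page, err) // real code should retry here
					continue
				}
				if last {
					done.Store(true) // boundary found: the master stops dealing
				}
				_ = recs // filter and store the records here
			}
		}()
	}

	wg.Wait()
	log.Println("all pages processed")
}
```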
The first question: what do you think of this revolver scheme? Can it be optimized further?
The second question: does it make sense to bother with computing the boundary up front, i.e. the total number of pages and so on? The program is long-running, and just spitting out raw metrics is useless; I would rather report progress as "N out of M".
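
For what it's worth, if the API exposes a total record count somewhere (an assumption; it may not), deriving M is a one-liner, and an atomic counter gives cheap "N out of M" reporting:

```go
package main

import (
	"fmt"
	"sync/atomic"
)

const pageSize = 50

// totalPages derives "M" from a total record count, assuming the API
// exposes one somewhere (hypothetical; adjust to what the provider returns).
func totalPages(totalRecords int) int {
	return (totalRecords + pageSize - 1) / pageSize // ceiling division
}

func main() {
	var processed atomic.Int64 // each worker bumps this after finishing a page
	total := totalPages(1_234_567)

	// Inside a worker, after a page is stored:
	n := processed.Add(1)
	fmt.Printf("progress: %d/%d pages\n", n, total)
}
```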
Thank you!
PS: The sketches above are simplified stand-ins rather than the real code, but I can assure you everything is written in Go, so Go examples will come in handy.

1 answer
bodrich, 2019-04-26

What is the approximate average number of records you get from the API / write to the table?
Is there a limit on the number of API requests?
