How to take records from a model one by one in Django without using .all()?
I need to write a function that runs in Celery, fetches records from a model one at a time, validates something, and writes data to another model linked by a one-to-one relation. There are a lot of records, so calling model_name.objects.all() and iterating over the result won't work (it would take too much memory and time). How can this be done?
Add a boolean field to the first model, like is_processed.
Select on it with .filter(is_processed=False) and take a portion of the data, for example with .first() or a slice, then set the flag on each instance once it has been processed.
Repeat the previous step until all flags are set.
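The three steps above could be sketched roughly like this. It is only a sketch: handle is a hypothetical callable that holds your per-record logic, the batch size of 200 is arbitrary, and queryset is assumed to be a Django QuerySet for the model that has the is_processed field.

```python
def process_unprocessed(queryset, handle, batch_size=200):
    """Process rows whose is_processed flag is still False, batch by batch.

    queryset: a Django QuerySet for the first model, e.g. Model.objects.all()
    handle: hypothetical callable holding the per-record logic
    batch_size: how many rows are loaded into memory at once
    """
    processed = 0
    while True:
        # Re-run the filter on every pass: rows flagged earlier drop out,
        # so the first slice always contains the next unprocessed rows.
        batch = list(
            queryset.filter(is_processed=False).order_by("id")[:batch_size]
        )
        if not batch:
            return processed
        for obj in batch:
            handle(obj)
            obj.is_processed = True
            # Write back only the flag column.
            obj.save(update_fields=["is_processed"])
            processed += 1
```

A side effect of the flag is that the task can be interrupted and restarted: it simply picks up the rows that are still unflagged.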
There is .iterator(), but it is (almost) useless without server-side cursors.
So it is easier to fetch slices. If records are guaranteed not to be created while the task runs, plain slices are enough:
Model.objects.all()[0:200]
Model.objects.all()[200:400]
...
If there is no such guarantee, you need to filter instead. For example:
First sample (the same query without the id filter):
Model.objects.order_by("-id")[:200]
Second and all following:
Model.objects.filter(id__lt=<last id from the previous sample>).order_by("-id")[:200]
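The filtered variant could be wrapped in a generator along these lines. This is a sketch, not a tested recipe for your project: the function name is made up, and it assumes the model has an integer id primary key and that queryset is a Django QuerySet.

```python
def iter_by_descending_id(queryset, batch_size=200):
    """Yield model instances newest-first, loading batch_size rows at a time.

    Each page is selected by id rather than by offset, so rows created
    while the loop is running do not shift the following pages.
    """
    last_id = None
    while True:
        page = queryset.order_by("-id")
        if last_id is not None:
            # Second and all following samples: only ids below the last one seen.
            page = page.filter(id__lt=last_id)
        batch = list(page[:batch_size])
        if not batch:
            return
        yield from batch
        last_id = batch[-1].id
```

Usage would be something like for obj in iter_by_descending_id(Model.objects.all()): ..., with at most one batch of instances held in memory at a time.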
A single Celery task walks through all the records and, where needed, spawns many small tasks in the same Celery app.
In other words, your task simply becomes a dispatcher, and all the logic for an individual record lives in the small task. It is advisable to route these small tasks to a separate worker with a high concurrency value.
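The body of such a dispatcher might look roughly like this. Treat it as a sketch under assumptions: dispatch_records and the "records" queue name are invented for illustration, per_record_task is assumed to be a Celery task that accepts a primary key, and in a real project this function would itself be called from a task decorated with Celery's shared_task.

```python
def dispatch_records(queryset, per_record_task, queue="records", chunk_size=2000):
    """Enqueue one small task per record and return how many were enqueued.

    queryset: a Django QuerySet for the source model
    per_record_task: a Celery task taking a primary key (assumption)
    queue: hypothetical queue name served by a dedicated worker
    """
    dispatched = 0
    # Only primary keys are fetched, not full model instances.
    pks = queryset.values_list("pk", flat=True).iterator(chunk_size=chunk_size)
    for pk in pks:
        # The small task re-reads its record by pk and does the real work.
        per_record_task.apply_async(args=[pk], queue=queue)
        dispatched += 1
    return dispatched
```

The dedicated worker would then be started for that queue only, for example with something like celery -A proj worker -Q records --concurrency=16, where proj and the concurrency value are placeholders.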