Sabrjkee, 2019-04-15 12:44:43
Django

How to fetch records from a model one by one in Django without using .all()?

I need to write a function that runs in Celery, fetches records from a model one at a time, validates something, and writes data to another model linked by a one-to-one relation. There are a lot of records, so calling model_name.objects.all() and iterating over the result won't work (it would take too much memory and time). How can this be done?

3 answers
Vladimir Kuts, 2019-04-15
@Sabrjkee

Add a boolean field to the first model, for example is_processed.
Select on it with .filter(is_processed=False), taking a portion of the data (for example with .first() or a slice), and set the flag on each instance once it has been processed.
Repeat the previous step until all flags are set.
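
The steps above could be sketched roughly like this. It is only an illustration, assuming the model has the is_processed field described above; the names process_pending, handle and batch_size are made up for the example, and model is expected to be a Django model class.

```python
def process_pending(model, handle, batch_size=100):
    """Process unflagged records in small batches until none are left.

    `model` is assumed to be a Django model class with a boolean
    `is_processed` field; `handle` is a hypothetical callable that does
    the validation and writes to the related model.
    """
    processed = 0
    while True:
        # Re-run the query on every pass: only a slice of the still
        # unprocessed rows is loaded into memory at a time.
        batch = list(
            model.objects.filter(is_processed=False).order_by("pk")[:batch_size]
        )
        if not batch:
            return processed
        for record in batch:
            handle(record)
            # Set the flag so the record is not selected again.
            record.is_processed = True
            record.save(update_fields=["is_processed"])
            processed += 1
```

Because the flag is saved per record, the task can be interrupted and restarted without redoing finished work. Note that if handle raises for some record, that record stays unflagged and will be picked up again on the next pass.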

Roman Kitaev, 2019-04-15
@deliro

There is .iterator(), but it is (almost) useless without server-side cursors.
So it is easier to fetch slices. If records are guaranteed not to be created while the task runs, plain slices will do: Model.objects.all()[0:200], Model.objects.all()[200:400], ... If there is no such guarantee, you need to filter instead. For example, the first selection is the same query without the id filter, and the second and all following ones are:

Model.objects.filter(id__lt=<last id from the previous selection>).order_by("-id")[:200]
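
A minimal sketch of that id-based filtering as a generator might look like the following. The function name iter_batches_by_id and the batch_size parameter are invented for the example, and it assumes the model has an integer id primary key.

```python
def iter_batches_by_id(model, batch_size=200):
    """Yield lists of records, newest id first, one batch at a time.

    `model` is assumed to be a Django model class with an integer `id`.
    Records created while the task runs get higher ids than the starting
    point, so they do not shift the batches the way plain slices would.
    """
    last_id = None
    while True:
        queryset = model.objects.order_by("-id")
        if last_id is not None:
            # Continue strictly below the last id already seen.
            queryset = queryset.filter(id__lt=last_id)
        batch = list(queryset[:batch_size])
        if not batch:
            return
        yield batch
        last_id = batch[-1].id
```

In a task this would be used as `for batch in iter_batches_by_id(Model): ...`, so that only one batch of records is held in memory at any moment.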

Vladimir, 2019-04-15
@vintello

One Celery task runs through all the records and, where necessary, creates many small tasks in the same Celery. In other words,
your task simply becomes a dispatcher task, and all the logic for a particular record goes into the small task. It is advisable to route these small tasks to a separate worker with a high concurrency value.
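
The dispatcher idea could be sketched as below. To keep the example runnable without a broker, the function takes an enqueue callable instead of importing Celery; the names dispatch_records, enqueue, needs_processing, process_record and the "records" queue are all hypothetical.

```python
def dispatch_records(record_ids, enqueue, needs_processing=None):
    """Walk over record ids and queue one small task per record.

    `enqueue` is whatever schedules the per-record task. With Celery it
    could be `process_record.delay`, or, to send the small tasks to a
    dedicated queue:
        lambda pk: process_record.apply_async(args=[pk], queue="records")
    where `process_record` is a hypothetical task holding the per-record
    logic. `needs_processing` is an optional filter on the id.
    """
    queued = 0
    for record_id in record_ids:
        if needs_processing is not None and not needs_processing(record_id):
            continue
        # Pass only the id; the small task loads the record itself.
        enqueue(record_id)
        queued += 1
    return queued
```

In Django the ids could come from something like `Model.objects.values_list("id", flat=True).iterator()`, and the separate worker could then be started for that queue alone, for example `celery -A proj worker -Q records --concurrency=8`, where proj and the concurrency value are placeholders.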
