Django
Andrew, 2015-04-14 14:35:10

How to optimize/speed up QuerySet sorting?

Good day!
There are 2 models - custom User and Interest. Each user has a certain number of interests. When a user navigates to a particular page, they are prompted to connect with other users with whom they have the most common interests.

class Interest(models.Model):
        name = models.CharField(max_length=50, unique=True)
        users = models.ManyToManyField(settings.AUTH_USER_MODEL,
                                       related_name="interests", blank=True)

To fetch and sort the users, we use the following function:
def suggested_people(user):
        queryset = User.objects.custom_filter(is_verified=True).order_by('-date_joined')
        users_sorted = sorted(queryset, key=lambda x: x.get_common_interest(user).count(), reverse=True)
        return users_sorted

User class instance method:
def get_common_interest(self, user):
        """ Return a queryset of the interests shared with the given user """
        your_interests = user.interests.values_list('pk', flat=True)
        return self.interests.filter(pk__in=your_interests)

The problem is that the list of received objects is sorted very slowly (about 8s for 1000 users). What are the possible ways to solve this problem? I would be grateful for any advice!


3 answer(s)
Andrew, 2015-04-14
@aphex

Solution:

from django.db.models import Count

# u is the current user; select the ids of that user's interests
interests_ids = u.interests.values_list('id', flat=True)
# the parentheses let the method chain span several lines
suggestions = (
    User.objects
    .exclude(id=u.id)  # exclude the current user
    .filter(is_verified=True)  # keep only verified users
    .filter(interests__id__in=interests_ids)  # keep users with at least one common interest
    .annotate(interests_count=Count('interests'))  # count each user's interests that remain after filtering
    .order_by('-interests_count')  # most common interests first
)
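
To make it easier to see what the annotated query above asks the database to do, here is a rough standalone sketch using sqlite3 from the standard library. The table and column names (app_user, interest_users) and the sample rows are made up for illustration, and the SQL is only an approximation of the idea (join, filter, group, count, sort), not the exact statement Django generates.

```python
import sqlite3

# Hypothetical, simplified schema: these names are assumptions for the sketch,
# not the tables Django would create for the models in the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE app_user (id INTEGER PRIMARY KEY, is_verified INTEGER NOT NULL);
    CREATE TABLE interest_users (interest_id INTEGER NOT NULL, user_id INTEGER NOT NULL);
""")
# Users 1, 2, 3 and 5 are verified; user 4 is not.
conn.executemany("INSERT INTO app_user VALUES (?, ?)",
                 [(1, 1), (2, 1), (3, 1), (4, 0), (5, 1)])
# (interest_id, user_id) pairs; user 1 is the "current" user.
conn.executemany("INSERT INTO interest_users VALUES (?, ?)", [
    (10, 1), (20, 1), (30, 1),
    (10, 2), (20, 2), (30, 2), (40, 2),
    (10, 3), (50, 3),
    (10, 4), (20, 4),
    (60, 5),
])


def suggested_people(conn, user_id):
    """Return (user_id, common_interest_count) rows, most common interests first."""
    sql = """
        SELECT u.id, COUNT(iu.interest_id) AS interests_count
        FROM app_user AS u
        JOIN interest_users AS iu ON iu.user_id = u.id
        WHERE u.id != :me                      -- exclude the current user
          AND u.is_verified = 1                -- only verified users
          AND iu.interest_id IN (              -- only interests the current user has
              SELECT interest_id FROM interest_users WHERE user_id = :me
          )
        GROUP BY u.id
        ORDER BY interests_count DESC, u.id
    """
    return conn.execute(sql, {"me": user_id}).fetchall()


print(suggested_people(conn, 1))  # [(2, 3), (3, 1)]
```

The counting and sorting happen in a single query inside the database, instead of one extra query per user in Python. Users with no common interests (user 5 here) drop out of the result entirely.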

Vadim Shandrinov, 2015-04-14
@suguby

You are running a separate database query for every user, which is extremely slow. You can inspect all the database queries with the debug toolbar, see habrahabr.ru/post/50221 (you will see a LOT of interesting things).
To avoid the per-user queries, use prefetch_related:

queryset = User.objects.custom_filter(is_verified=True).prefetch_related('interests').order_by('-date_joined')
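
One caveat worth keeping in mind: as far as I know, the prefetched data is only reused when the relation is read as-is (for example self.interests.all()); calling .filter() on it, as get_common_interest does, issues a new query anyway. So with prefetch_related the common interests would have to be counted in Python. A minimal sketch of that counting step on plain data follows; rank_by_common_interests and the interests_by_user mapping are hypothetical names, and the mapping is assumed to be built from the prefetched interests of each user.

```python
def rank_by_common_interests(current_interests, interests_by_user):
    """Sort user ids by how many interest ids they share with the current user.

    current_interests: iterable of the current user's interest ids.
    interests_by_user: dict mapping a user id to an iterable of interest ids
    (hypothetical input, assumed to come from the prefetched interests).
    """
    current = set(current_interests)
    # One set intersection per user, with no database access at all.
    counts = {
        user_id: len(current & set(interest_ids))
        for user_id, interest_ids in interests_by_user.items()
    }
    # Most common interests first; ties keep their original order.
    return sorted(counts, key=counts.get, reverse=True)


ranking = rank_by_common_interests(
    [10, 20, 30],
    {2: [10, 20, 30, 40], 3: [10, 50], 5: [60]},
)
print(ranking)  # [2, 3, 5]
```

Unlike the filtered database query, this keeps users with zero common interests at the end of the list, and it still loads every user and interest into memory first.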

Dmitry Grebenshchikov, 2015-04-14
@iMeath

You can select only the data you need, which significantly speeds up the query.
I highly recommend reading the article on Habr.
In addition, do not forget to cache.
