How to understand the reason for fetch behavior in MongoDB?

C

Caretaker2019-06-16 23:22:51

MongoDB

Caretaker, 2019-06-16 23:22:51

Greetings. Input data:
- a collection of about 11.5 million records
- each document has a string format field, which can either have a string or an empty value
- each document has a numeric field from 0 to 100, which is used in sorts
Query 1 (1.74 sec) = 5971171:

db.getCollection('channels').find(
    { 'update': { $gt: '' }}
).count();

Request 2 (1.74 sec) = 5697946:

db.getCollection('channels').find(
    { 'update': { $eq: '' }}
).count();

Relatively close results, i.e. the entire collection is essentially divided approximately in half - empty / non-empty values. Next, I want to select documents with the highest sort index from these two groups:
Query by "not empty" (0.005 sec):

db.getCollection('channels').find(
    { 'update': { $gt: '' }}
).sort(
    { 'sort': -1 }
).limit(1);

Request for "empty" (12.4 sec):

db.getCollection('channels').find(
    { 'update': { $eq: '' }}
).sort(
    { 'sort': -1 }
).limit(1);

Indices:

_sort_ {
    "sort" : -1
}
_update_ {
    "update" : 1
}

What is the reason for this runtime difference? Especially when you consider that the index on the "update" field is increasing, and empty values "" should be at the beginning of the index, which means it's much faster to get out...

Reply

Answer the question

In order to leave comments, you need to log in

1 answer(s)

P

Philipp, 2019-06-17
@zoonman

Write -1 where there is no value.
Use {update: -1}to search for "blank" lines.
What does explain say?