MongoDB
Caretaker, 2019-06-07 08:25:18

Why does a query that filters on a nested field slow down with a descending sort?

Greetings.
There is a relatively small database of 2.5 million documents. The document structure is not very important, especially since it is "dynamic". But there is one query that runs regularly and quite often: it selects the "very top" document, under the condition that a nested field is absent:

db.getCollection('special').find({
    'checker.updated': { '$exists': false }
}).sort({'postedNum':-1}).limit(1)

The problem is that this query takes about 40 seconds, whereas this one:
db.getCollection('special').find({
    'checker.updated': { '$exists': false }
}).sort({'postedNum':1}).limit(1)

returns the result in less than a second.
I tried experimenting with indexes:
- separate indexes on postedNum:-1 and checker.updated:1
- a compound index on postedNum:-1 + checker.updated:1
- an index on the whole checker subdocument
The result is zero. Sorting in ascending order is instant; as soon as I switch to descending order, everything grinds to a halt. I even thought about moving to something relational, but what holds me back there is the speed of active work with tables larger than 20-25 million records...
Update 1. I tried replacing the nested-field filter with a top-level one (I made a copy, 'checker.updated' => 'checkerUpdated'), and everything began to "fly" with both ascending and descending sort. I don't understand anything... One solution, of course, is to abandon the nested structure completely, but damn it, that means reworking half of the code and adding a bunch of blocks that would have to "fold / expand" the working structures into single-level lists of fields.
Update 2. Now I don't understand at all. I shut down all the processes / nodes that were running the same "reverse sort" query in parallel (I kept the conditions and removed the sorting, which is not good but will do as a temporary option), and the wild delays for this same query immediately dropped from 40s+ to 2s... So it turns out that such delays are caused by many parallel accesses to the database with the same query, but only in the combination "sorting + filter by nested field"... What the hell???


2 answer(s)
Philipp, 2019-06-09
@zuart

A good place to start would be the output of explain.
Query performance depends on the index used, but in this case you are using $exists: false, which automatically leads to a full collection scan.
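As a rough illustration of what to look for in that output, here is a minimal sketch that walks an explain result and lists the stage names of the winning plan. It assumes the classic queryPlanner.winningPlan layout with stage, inputStage and inputStages fields; the helper name planStages and the sample plan are made up for the example, and newer server versions may nest the plan differently.

```javascript
// Hypothetical helper: collect stage names from an explain() winning plan.
// Assumes the classic layout (stage / inputStage / inputStages).
function planStages(plan) {
  if (!plan || typeof plan !== 'object') return [];
  const children = [];
  if (plan.inputStage) children.push(plan.inputStage);
  if (Array.isArray(plan.inputStages)) children.push(...plan.inputStages);
  const own = plan.stage ? [plan.stage] : [];
  return own.concat(...children.map(planStages));
}

// In the mongo shell the plan would come from something like:
//   const res = db.getCollection('special')
//     .find({ 'checker.updated': { $exists: false } })
//     .sort({ postedNum: -1 }).limit(1)
//     .explain('executionStats');
//   planStages(res.queryPlanner.winningPlan);

// Made-up sample plan, only to show the shape being walked.
const samplePlan = {
  stage: 'SORT',
  inputStage: { stage: 'COLLSCAN' },
};
const stages = planStages(samplePlan);
console.log(stages.join(' <- '));
```

If the list contains COLLSCAN, or a SORT stage sitting on top of the scan, the query is probably not being served by an index in the order the sort needs.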
Indexing such a field is not an option: if you want to use an index, the value must be present in the index.
To solve the problem, add an explicit value, something like updated: false.
If you store a date there, use an additional field such as updatedAt. You should not mix different data types in one field, or MongoDB may not be able to build the index.
For the reverse sort, you can create an additional index with the reverse order.
You can read more about sorting here: https://docs.mongodb.com/manual/tutorial/sort-resu...
You can also build a compound index that covers the forward lookup on the property plus the reverse sort; just be sure to keep the field order right.
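A minimal sketch of that idea, under some assumptions: the flag field checker.isUpdated and the two function names are hypothetical (they are not in the question), and coll stands for the collection object you get from db.getCollection('special') in the mongo shell. The calls used (updateMany, createIndex, find, sort, limit) are the standard collection and cursor methods.

```javascript
// Sketch only: `checker.isUpdated` is a hypothetical explicit flag field.
function prepareFlagIndex(coll) {
  // Backfill: give documents without checker.updated an explicit false flag.
  coll.updateMany(
    { 'checker.updated': { $exists: false } },
    { $set: { 'checker.isUpdated': false } }
  );
  // Equality field first, then the sort field in the direction the query uses.
  coll.createIndex({ 'checker.isUpdated': 1, postedNum: -1 });
}

// The original query rewritten to match on the stored value instead of $exists.
function findTopUnchecked(coll) {
  return coll
    .find({ 'checker.isUpdated': false })
    .sort({ postedNum: -1 })
    .limit(1);
}

// In the mongo shell:
//   const coll = db.getCollection('special');
//   prepareFlagIndex(coll);
//   findTopUnchecked(coll);
```

The code that writes checker.updated would also have to set the flag to true, otherwise the flag and the real state drift apart.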

Leonid, 2019-06-07
@caballero

Because NoSQL databases, fashionable until recently, are not designed for selections that are both complex and fast.
This will not be your last problem: the sooner you switch to a proper relational database, the fewer problems you will have in the future.
