MongoDB
Alexey, 2014-02-06 02:14:42

How to properly implement pagination in MongoDB with sorting by compound index?

The question is more abstract and theoretical; it is less a question than a reflection on the topic.

TL;DR
How do you properly implement pagination in MongoDB with sorting on a compound index, combined with filtering on a heap (~40 booleans, numbers, strings, arrays) of keys? Estimated collection size: 500K - 1M documents.
I know about the "nimble" skip() and about ranged queries, but as fate would have it, their constraints are mutually exclusive:
- the query must be processed as quickly as possible (at most ~5 ms per query), so goodbye skip()
- the result must be persistent (users can share result links), so goodbye ranged queries
In the first case, ironically, the most logical and "correct" approach has "small" performance problems that grow directly with the number of skipped records.
// skip()
db.users
  .find({
    make: 'Mercedes-Benz',
    model: 'S 500/550',
    state: 'sale',
    arr: { $in: [ 'used', 'notDamaged' ] },
    mileage: { $lte: 80000 }
  })
  .sort({ priority: 1, age: -1, price: 1 })
  .skip(200) // results already shown
  .limit(25);

In the second case the result is extremely "fragile": everything hinges on the field that defines the range boundary. Deleting a document, editing one of the fields involved in the selection, or changing a state that affects whether a document appears in the results: all of these are "deadly" for persistence.
Example:
// Ranged query
db.users
  .find({
    _id: { $gt: lastId }, // _id of the last document from the previous result
    make: 'Mercedes-Benz',
    model: 'S 500/550',
    state: 'sale',
    arr: { $in: [ 'used', 'notDamaged' ] },
    mileage: { $lte: 80000 }
  })
  .sort({ priority: 1, age: -1, price: 1 })
  .limit(25);
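A side note on the ranged query above: for it to page correctly, the range condition has to match the sort order. Ranging on _id alone while sorting on priority/age/price will skip or repeat documents between pages. A minimal sketch of a keyset ("seek") cursor that does match the compound sort, assuming the cursor object carries the sort-key values of the last document on the previous page (the helper name is mine, not from the question):

```javascript
// Build the keyset filter for sort { priority: 1, age: -1, price: 1 },
// with _id as a final tie-breaker so the ordering is total.
// `last` holds the sort-key values of the previous page's last document.
function keysetFilter(last) {
  return {
    $or: [
      { priority: { $gt: last.priority } },
      { priority: last.priority, age: { $lt: last.age } },
      { priority: last.priority, age: last.age, price: { $gt: last.price } },
      { priority: last.priority, age: last.age, price: last.price,
        _id: { $gt: last._id } }
    ]
  };
}
```

In use, this filter would be combined with the base selection via $and, and the sort extended to { priority: 1, age: -1, price: 1, _id: 1 }. Each page is then a fresh index seek, but it shares the same fragility as any ranged query: edits to the sort fields still shift the boundary.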

From the expected outcomes and these disappointing limitations, I singled out the two most realistic options:
- building your own, usually "heavy", index for each query type (a type being a combination of sort order, filter keys, etc.) on top of Redis/Memcached. The first thing that comes to mind is a "heavy" Map/Reduce with incremental (depending on the developer's desire/knowledge/skill) rebuilding of everything and everyone.
- switching to a different technology (in my case, a different NoSQL database); more on that below.
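The first option can be sketched very roughly: materialize the full ordered list of matching _ids once per query signature into an external store, then serve any page as a cheap slice. All names here are illustrative, and a plain in-memory Map stands in for Redis/Memcached:

```javascript
// Materialized per-query index: build the sorted _id list once,
// then any page is an O(limit) slice, independent of the offset.
const pageIndex = new Map();

// Called after running the expensive query (or a Map/Reduce rebuild);
// on writes it would be refreshed incrementally or invalidated.
function materialize(querySignature, sortedIds) {
  pageIndex.set(querySignature, sortedIds);
}

function page(querySignature, offset, limit) {
  const ids = pageIndex.get(querySignature) || [];
  return ids.slice(offset, offset + limit);
}
```

This buys both speed and persistence of page boundaries, at the cost of storage per query type and the rebuild machinery the question mentions.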
The most realistic way out for me was switching to another NoSQL database. There are quite a few decent ones, but in properties comparable to MongoDB I narrowed it down to three:
- CouchDB: Map/Reduce and views; with a properly built index it is a very smart thing, and pagination is not a problem.
- Couchbase Server: the brainchild of people from the CouchDB project, faster than it; open source that is not quite open source (new versions arrive "late").
- Riak: I can't say anything yet, but most developers I know speak of it very positively.
- RethinkDB: the fourth, dropped option, the ugly duckling. It has its own very convenient query language but lags behind the top three in performance, has carried that burden for quite a while, and no breakthrough is in sight: according to the developers, they are trying but not really succeeding.
