Answer the question
In order to leave comments, you need to log in
How to store and select such data?
There are a large number of records that, among other fields, have, for example, a list of labels, a date, and a status. When working with this data, the user first filters it by all fields except date and status. A filter can return as few records as possible, or at least all of them. After that, the filtered data should be quickly pulled out by status and date range.
The problem is that the filter can be very complex and long to execute (for example, records that contain 20 tags, do not contain 20 other tags and contain some word in the title), that is, you can’t put it in every request. Accordingly, it is necessary to cache the list of filtered records, and this cache itself should be indexed by date/status for subsequent fetching. What tools (DB) would be used to implement this effectively? SQL, NoSQL, I basically don't care now.
Answer the question
In order to leave comments, you need to log in
There is a similar mechanism in one CMS, there are a lot of joins with large volumes, and according to some versions, the speed issue is solved quite well by moving indexes to Sphinx, I don’t know how much this will help you, but I think it’s worth google as an option.
Build your indexes on ranges of values, taking into account your knowledge of the distributions of your data and possible queries, so that you immediately cut off the maximum. Once upon a time it was very helpful for searching through all LiveJournal users marked up with a lot of parameters.
A filter can return as few records as possible, or at least all of them. After that, the filtered data should be quickly pulled out by status and date range.
Didn't find what you were looking for?
Ask your questionAsk a Question
731 491 924 answers to any question