How to store and select such data?

D

deleted-mifki2013-04-02 06:45:15

SQL

deleted-mifki, 2013-04-02 06:45:15

There are a large number of records that, among other fields, have, for example, a list of labels, a date, and a status. When working with this data, the user first filters it by all fields except date and status. A filter can return as few records as possible, or at least all of them. After that, the filtered data should be quickly pulled out by status and date range.

The problem is that the filter can be very complex and long to execute (for example, records that contain 20 tags, do not contain 20 other tags and contain some word in the title), that is, you can’t put it in every request. Accordingly, it is necessary to cache the list of filtered records, and this cache itself should be indexed by date/status for subsequent fetching. What tools (DB) would be used to implement this effectively? SQL, NoSQL, I basically don't care now.

Reply

Answer the question

In order to leave comments, you need to log in

3 answer(s)

S

Sergey Cherepanov, 2013-04-03
@fear86

There is a similar mechanism in one CMS, there are a lot of joins with large volumes, and according to some versions, the speed issue is solved quite well by moving indexes to Sphinx, I don’t know how much this will help you, but I think it’s worth google as an option.

D

Daedmen, 2013-04-03
@Daedmen

Build your indexes on ranges of values, taking into account your knowledge of the distributions of your data and possible queries, so that you immediately cut off the maximum. Once upon a time it was very helpful for searching through all LiveJournal users marked up with a lot of parameters.

M

Maxim, 2013-04-03
@mgramin

A filter can return as few records as possible, or at least all of them. After that, the filtered data should be quickly pulled out by status and date range.

Those. in any case, a certain limited set of data is returned to the user, right? In this case, you can start with relational and competent indexing.