What is the best way to organize such a system?

A

Alexey Laud2017-12-11 02:40:18

MySQL

Alexey Laud, 2017-12-11 02:40:18

Faced with the fact that the main section of the service has recently (contributed to the growth of the audience) loads the processor by 100%. Perhaps there are other options for organizing this section or other technologies that would work better.
Task:
Show the user ideas (posts) that he has not yet seen and whose tags fit his filter. The filter is a list of tags, some of which may contain characters that change their behavior. For example, tags that should not be in ideas, tags that should be in one idea at the same time, tags that must be in every idea. Tags containing the * symbol cover all options matching the mask at once.
Ideas should be sorted by the date of the last restart (transferring the active field from 0 to 1 raises the idea to the top).
Now done like this:
Mysql. The ideas, ideas_tags, ideas_index, and ideas_seen tables.
The query looks something like this:
select * from ideas where
id in (select idea_id from ideas_tags where tag_id in (123,123,13342,232,133,82))
and id not in (select idea_id ...)
and ideas.active=1
and id not in (select idea_id from ideas_seen where user_id=182736)
order by (select id from ideas_index where idea_id=ideas.id) desc
limit 10
Sometimes filters by idea length are also added.
Are there any options on how to implement the same functionality in an easier way for the server?

Reply

Answer the question

In order to leave comments, you need to log in

3 answer(s)

K

kn0ckn0ck, 2017-12-11
@kn0ckn0ck

Optionally, the logic of a query affects its performance. To a greater extent, this can be influenced by the specifics of the data, the physics of data placement, etc.
If you are sure that the problem is in this query, then you need to analyze not the query, but its execution plan (explain ...). Nothing meaningful can be said without a plan. One can only offer abstract ideas for optimization:
1. thoughtful indexes
2. denormalization
3. materialized views
4. partitioning
5. engine types

B

Boris Korobkov, 2017-12-11
@BorisKorobkov

order by (select id from ideas_index where idea_id=ideas.id) desc

At a minimum, ideas_index should be removed from the query, and a new sort (denormalize) field should be added to the ideas table.

H

hx510b, 2018-06-02
@hx510b

Apart from the suggestions above. I would optimize where
Conditions are simple and efficient in terms of indexes move up, then inefficiently increasing.
After all, with each new condition, the sample size decreases, which means the server will have to do fewer iterations,
then I would reduce id in and id not in to two, it will be faster than digging the sample several times (especially if it is large or does not fit into memory at all) for list comparisons. Those. At first we prepare lists, then we compare once.
Of course, the server itself must be optimized for such requests in order to have enough allocated memory.