Answer the question
In order to leave comments, you need to log in
What is the best way to organize such a system?
Faced with the fact that the main section of the service has recently (contributed to the growth of the audience) loads the processor by 100%. Perhaps there are other options for organizing this section or other technologies that would work better.
Task:
Show the user ideas (posts) that he has not yet seen and whose tags fit his filter. The filter is a list of tags, some of which may contain characters that change their behavior. For example, tags that should not be in ideas, tags that should be in one idea at the same time, tags that must be in every idea. Tags containing the * symbol cover all options matching the mask at once.
Ideas should be sorted by the date of the last restart (transferring the active field from 0 to 1 raises the idea to the top).
Now done like this:
Mysql. The ideas, ideas_tags, ideas_index, and ideas_seen tables.
The query looks something like this:
select * from ideas where
id in (select idea_id from ideas_tags where tag_id in (123,123,13342,232,133,82))
and id not in (select idea_id ...)
and ideas.active=1
and id not in (select idea_id from ideas_seen where user_id=182736)
order by (select id from ideas_index where idea_id=ideas.id) desc
limit 10
Sometimes filters by idea length are also added.
Are there any options on how to implement the same functionality in an easier way for the server?
Answer the question
In order to leave comments, you need to log in
Optionally, the logic of a query affects its performance. To a greater extent, this can be influenced by the specifics of the data, the physics of data placement, etc.
If you are sure that the problem is in this query, then you need to analyze not the query, but its execution plan (explain ...). Nothing meaningful can be said without a plan. One can only offer abstract ideas for optimization:
1. thoughtful indexes
2. denormalization
3. materialized views
4. partitioning
5. engine types
order by (select id from ideas_index where idea_id=ideas.id) desc
Apart from the suggestions above. I would optimize where
Conditions are simple and efficient in terms of indexes move up, then inefficiently increasing.
After all, with each new condition, the sample size decreases, which means the server will have to do fewer iterations,
then I would reduce id in and id not in to two, it will be faster than digging the sample several times (especially if it is large or does not fit into memory at all) for list comparisons. Those. At first we prepare lists, then we compare once.
Of course, the server itself must be optimized for such requests in order to have enough allocated memory.
Didn't find what you were looking for?
Ask your questionAsk a Question
731 491 924 answers to any question