E
E
eliastro2015-04-23 12:50:51
Database
eliastro, 2015-04-23 12:50:51

How to organize the storage of replacement products for search optimization?

Good afternoon!
There are elements (goods) that can be identical replacements for each other. For example, there is a product A, for it there are replacements B and C. When searching for A, its replacements - B and C should be displayed. Further, replacements D and E can be set for product B. When searching for A, then B, C, D, E should be displayed . All substitutions are identically equal to each other. Accordingly, when searching for C, A, B, D, E should be displayed.
Now storage of replacements is organized by the table with 2 fields.
A - B
A - C
B - E
B - D.
The current search algorithm is reduced to recursive execution of search queries to the database. For example, for A, its replacements are first searched for. Then, for the found replacements, their replacements are searched for, and so on. Then everything is combined and the final result of the search is displayed. This is very suboptimal, because for each replacement found, another request is made, and so on. Those. a bunch of requests, but you need to get the necessary data in one request.
Tried data denormalization, i.e. store all combinations:
A - B
A - C
A - E
A - D
B - A
B - C
B- D
B -E
C -A
...
But then the table grows to a huge size - hundreds of millions of rows (in our conditions). Plus, there is another problem, if you remove the replacement relationship, for example, between A and B, then it turns out that E and D also cease to be replacements for A and C, how then to remove these relationships in a denormalized table is not clear.

Answer the question

In order to leave comments, you need to log in

3 answer(s)
V
Valery Ryaboshapko, 2015-04-23
@eliastro

If the replacements are bidirectional regardless of the level, that is (in your example) A is a replacement for E, then you can simply enter one more field that will be the same for all products in one replacement group. Something like a many-to-one relationship, but without an additional table. Although no one bothers to introduce an additional table to give a name to the replacement group.
Well and, by itself, an index on this additional field.

R
Rsa97, 2015-04-23
@Rsa97

Here you need to look at how often the list of replacements changes. If not often, then keep two tables - the main set, as you have now, and a cached list of the form ('A' - 'B,C,D,E'), rebuilding it when the main set changes.

S
sim3x, 2015-04-23
@sim3x

But then the table grows to a huge size - hundreds of millions of rows (in our conditions).
and fine. All the same, only three ints are stored there, right?
When denormalizing, you will have to attach a trigger to update of any product that will update the cached value.
When deleting a product, it will shovel all products in which this product is a substitute

Didn't find what you were looking for?

Ask your question

Ask a Question

731 491 924 answers to any question