D
D
denisigo2010-10-14 17:46:13
SQL
denisigo, 2010-10-14 17:46:13

The organization of storage of structure of categories in a relational DB?

The task is to organize the storage of a certain catalog, with a fairly branched structure (tree) - let it be a product catalog of an online store. Only a URI like "/category/subcategory/another-category/and-one-more-category" is available to search for an element. The maximum nesting is about 10.
Categories are requested frequently, rarely change, the total number of categories can be about 100 thousand.
Fast generation of breadcrumbs is also required. Moreover, the link to the category (“and-one-more-category”) may differ from its title (“And one more category”), which is used for display on the page.
So far I have one proposed solution - "on the forehead" - in the footsteps of Materialized path :
the table for categories has the following structure
CREATE TABLE categories (
`id` INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
`title` VARCHAR(50) NOT NULL,
`link` VARCHAR(50) NOT NULL,
`path` VARCHAR(1000) NOT NULL,
`title_path` VARCHAR(1000 ) NOT NULL
)
CREATE INDEX path_indx ON categories (`path`);
`title` - category title ("And one more category"),
`link` - category link ("and-one-more-category"),
`path` - category path ("category/subcategory/another-category/ and-one-more-category"),
`title_path` - the same as `path`, only contains the titles of the corresponding categories - for quick breadcrumb generation
- The attraction is that no effort is needed to search for a category - just SELECT… WHERE path LIKE…
- Even the need to rebuild paths in case of moving / renaming nodes does not scare.
- It scares the redundancy of the approach and the likely size of the table with a large number of categories. How much will it affect speed?
- It is also confusing that such a long string in `path` is used as a search key (although I very much doubt that it will ever go beyond 100 characters)
Can I put `path` and `title_path` in a separate table? So all the same, the path and breadcrumbs for the category are almost always required, so you have to join ...
I look towards Full hierarchy , but again, the possible redundancy in the hierarchy table confuses, especially considering the potential number of categories and nesting levels.
How best to solve the problem?

Answer the question

In order to leave comments, you need to log in

4 answer(s)
A
Alex Shinkevich, 2010-10-14
@aleXoid

Maybe it's worth just caching everything in MemcacheDB and changing cache entries when rebuilding?
The path key
Inside is an array with a category string from the database + an array for breadcrumbs

A
Alexey Zhurbitsky, 2010-10-14
@blo

should it be possible to change the parent category, for example when editing a subcategory (category/subcategory/… to category/subcategory-1/)? If not, then perhaps your option is suitable. If you need to provide for this possibility, and indeed have a more flexible structure, I advise you to google nested sets

W
WebByte, 2010-10-15
@WebByte

100,000 categories is 5 characters per node in the path. 10*5 is the maximum path per category.
Total maximum 5 megabytes of data. In reality, much less.
Not the size to worry about.
Regarding CRC32 and md5
Firstly, md5 is 32 characters per hash in hex representation, and less in Base64 representation.
But compared to the maximum of 50 characters, some dubious win, save a penny.
Secondly, how then are you going to use LIKE for searching?
md5("abc") is not like concat(md5("ab"), '%')
Conclusion: do it and don't worry about sizes.

S
Sergey, 2010-10-18
@liaren

I can advise you to use a mixed approach, like DaBase. See implementation .
Those. there it is used as a Nested Sets principle (which speeds up the selection of child elements), and each node also has a place to have parent_id and level parameters.

Didn't find what you were looking for?

Ask your question

Ask a Question

731 491 924 answers to any question