MySQL
ugin_root, 2020-11-25 08:03:15

How to get value by index from comma separated list ("5,9,12")?

I have a table like this:

CREATE TABLE `category` (
  `id` int(10) UNSIGNED NOT NULL AUTO_INCREMENT,
  `parent_id` int(10) UNSIGNED DEFAULT NULL,
  `name` varchar(255) NOT NULL,
  PRIMARY KEY (`id`),
  KEY `parent_id` (`parent_id`),
  CONSTRAINT `parent` FOREIGN KEY (`parent_id`) REFERENCES `category` (`id`)
)


And there is a recursive query that fetches the whole tree structure in one go:

WITH RECURSIVE category_path AS
(
  SELECT 
    c.parent_id, 
    c.id, 
    CONVERT(c.id, CHAR(1000)) AS ids
  FROM category AS c
  WHERE parent_id IS NULL
  
  UNION ALL
  
  SELECT 
    c.parent_id, 
    c.id, 
    CONCAT(p.ids, ',', c.id) AS ids
  FROM category AS c
    JOIN category_path AS p ON p.id = c.parent_id
)
SELECT 
  category_path.*
FROM category_path
ORDER BY category_path.ids


The result looks like this (dotted lines mark omitted rows):

parent	id	ids
null	1	1
1	190	1,190
....
1	2	1,2
2	103	1,2,103
....
2	3	1,2,3
...


I need to sort this result by the ids field. The example result shows the problems with sorting on it directly:
  • The nodes within each level do not come out in numeric order
  • The values are compared as strings, not as numbers, i.e. 190 is placed before 2
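
The string-versus-number difference can be checked directly. This is only a small illustration using literal values taken from the result above:

```sql
-- Strings are compared character by character, so '1,190' sorts before '1,2'
-- (the third character '1' is less than '2'), while numerically 190 > 2.
SELECT '1,190' < '1,2' AS string_order,   -- 1 (true)
       190 < 2         AS numeric_order;  -- 0 (false)
```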


MySQL has a FIND_IN_SET function that returns the index of a given value in such a list.
I'm interested in the inverse function, i.e. retrieving a value by its index. Is there an easy way to do this?
So far I have found two ways to do it.

The first option is this:

SUBSTRING_INDEX(SUBSTRING_INDEX(ids, ',', 1), ',', -1) + 0 ASC,
SUBSTRING_INDEX(SUBSTRING_INDEX(ids, ',', 2), ',', -1) + 0 ASC,
SUBSTRING_INDEX(SUBSTRING_INDEX(ids, ',', 3), ',', -1) + 0 ASC,
#...


This option has a problem: when the index is larger than the number of items, the last item is returned. For example, for the string "1,2,103" index 5 returns 103, and so does any other index greater than 3.

After some thought, I came up with a variant that does not have this problem:
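
One possible way to guard the SUBSTRING_INDEX approach, sketched here as an assumption rather than something from the original post, is to count the items first (number of commas plus one) and return NULL when the index is out of range. The variables @ids and @n are illustrative only:

```sql
-- Sketch: return the n-th item (1-based) of a comma-separated list,
-- or NULL when the list has fewer than n items.
SET @ids = '1,2,103';
SET @n = 5;

SELECT IF(
         -- item count = number of commas + 1
         @n <= LENGTH(@ids) - LENGTH(REPLACE(@ids, ',', '')) + 1,
         SUBSTRING_INDEX(SUBSTRING_INDEX(@ids, ',', @n), ',', -1) + 0,
         NULL
       ) AS nth_value;   -- NULL for @n = 5, 103 for @n = 3
```

In an ascending ORDER BY, MySQL places NULL before non-NULL values, so with this guard a parent row (shorter path) would sort before its children.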

REGEXP_REPLACE(category_path.ids, '^((([0-9]+,){0})([0-9]+)((,[0-9]+)*)|.*)$','\\4') + 0,
REGEXP_REPLACE(category_path.ids, '^((([0-9]+,){1})([0-9]+)((,[0-9]+)*)|.*)$','\\4') + 0,
REGEXP_REPLACE(category_path.ids, '^((([0-9]+,){2})([0-9]+)((,[0-9]+)*)|.*)$','\\4') + 0,
#...


But this solution is still too complicated and not obvious, and it relies on regular expressions.
It also has a drawback: to sort long "branches" you need to write dozens of expressions in ORDER BY.

While writing the question, I had an idea that solved my sorting problem in a different way.
In the ids column, I pad the id of each node with leading zeros to a fixed width of 10 characters (yours may differ; it depends on the column type, and for INT(10) UNSIGNED it is 10, because the largest value, 4294967295, has 10 digits). I also increased the maximum string length of this field to CHAR(11000): each level takes 10 digits plus a comma, i.e. 11 characters, so it works correctly for trees nested up to 1000 levels deep.
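
The effect of the padding can be seen on literal values. This is a small sketch that assumes the width of 10 described above:

```sql
-- Zero-padding makes string order agree with numeric order.
SELECT LPAD(190, 10, '0') AS padded;   -- '0000000190'

SELECT CONCAT(LPAD(1, 10, '0'), ',', LPAD(190, 10, '0'))
     > CONCAT(LPAD(1, 10, '0'), ',', LPAD(2, 10, '0')) AS padded_order;  -- 1 (true)
```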

This is how the query turned out:

WITH RECURSIVE category_path AS
(
  SELECT 
    c.parent_id, 
    c.id,
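    -- pad to 10 digits, the maximum length of an INT UNSIGNED value
    CONVERT(LPAD(c.id, 10, '0'), CHAR(11000)) AS ids
  FROM category AS c
  WHERE parent_id IS NULL
  
  UNION ALL
  
  SELECT 
    c.parent_id, 
    c.id, 
    -- append the padded id of the current node to the parent's path
    CONCAT(p.ids, ',', LPAD(c.id, 10, '0')) AS ids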
  FROM category AS c
    JOIN category_path AS p ON p.id = c.parent_id
)
SELECT category_path.*
FROM category_path
ORDER BY category_path.ids;


Now the result is sorted correctly, and there is no need to write a separate sort expression for each node position in the ids column. The only thing I'm not sure about is performance on large tables. Right now the category table has 963 records in total and the query takes 100 milliseconds.

The result is this:

parent	id	ids
null	1	0000000001
1	2	0000000001,0000000002
2	3	0000000001,0000000002,0000000003
3	4	0000000001,0000000002,0000000003,0000000004
4	5	0000000001,0000000002,0000000003,0000000004,0000000005
2	103	0000000001,0000000002,0000000103
103	104	0000000001,0000000002,0000000103,0000000104
104	105	0000000001,0000000002,0000000103,0000000104,0000000105
104	427	0000000001,0000000002,0000000103,0000000104,0000000427
103	353	0000000001,0000000002,0000000103,0000000353
353	354	0000000001,0000000002,0000000103,0000000353,0000000354
103	653	0000000001,0000000002,0000000103,0000000653
2	219	0000000001,0000000002,0000000219
....


If someone has ideas on how to make the query simpler or faster, or if there is something I have not taken into account, please write about it.

In principle, I have solved my sorting task, but the question of the simplest possible way to extract a value by index from a list of comma-separated values remains open.
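
One candidate for that open question, offered only as a sketch under assumptions (MySQL 5.7 or later with JSON functions, and a list of plain integers without leading zeros), is to wrap the list in brackets and read it as a JSON array. Note that JSON array indexes are 0-based, while FIND_IN_SET positions are 1-based:

```sql
SET @ids = '5,9,12';

-- FIND_IN_SET: value -> 1-based position
SELECT FIND_IN_SET('9', @ids) AS pos;                                  -- 2

-- Inverse: 0-based position -> value, treating the list as a JSON array
SELECT JSON_EXTRACT(CONCAT('[', @ids, ']'), '$[1]') AS val;            -- 9
SELECT JSON_EXTRACT(CONCAT('[', @ids, ']'), '$[5]') AS out_of_range;   -- NULL
```

This would not apply to the zero-padded ids column as is, because numbers with leading zeros are not valid JSON.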

3 answer(s)
BasiC2k, 2020-11-25
@BasiC2k

Perhaps you should try using IN in the query. For example:
SELECT * FROM Table WHERE Field IN (5,9,12)

Hakhagmon, 2016-10-11
@Hakhagmon

Try replacing the mail() function with another one with settings; maybe it's something on the server side.

sweezy, 2016-10-11
@sweezy

It's strange, but the messages seem to come after some time, maybe there is some kind of limiter on the server?