S
S
Sergey c0re2021-06-06 11:58:03
Oracle
Sergey c0re, 2021-06-06 11:58:03

How to correctly split a string into substrings by delimiter on multiple strings (by dataset)?

I don’t understand something ...

Everything is simple with one line, it is broken through CONNECT BY, for example

WITH 
  test AS (
SELECT 'ул.' as code, 'ул.|улица|ул-ца|' as str FROM dual
)
SELECT t1.code, SUBSTR(t1.str, 
              DECODE(level, 1, 1, INSTR(t1.str, '|', 1, level - 1) + 1),
              INSTR(t1.str, '|', 1, level) 
              - DECODE(level, 1, 1, INSTR(t1.str, '|', 1, level - 1) + 1))
FROM test t1
CONNECT BY NVL(INSTR(t1.str, '|', 1, level), 0) <> 0
;


but I can't figure out how to do it on a lot of similar lines, they all "multiply" on each other ...

WITH 
  test AS (
SELECT 'проул.' as code, 'проул.|пр-к|' as str FROM dual UNION
SELECT 'ул.' as code, 'ул.|улица|ул-ца|' as str FROM dual UNION
SELECT 'пер.' as code, 'пер.|переулок|' as str FROM dual UNION
SELECT 'кв-л' as code, 'кв-л|квартал|' as str FROM dual UNION
SELECT 'линия' as code, 'линия|' as str FROM dual UNION
SELECT 'парк' as code, 'парк|' as str FROM dual UNION
SELECT 'рзд' as code, 'рзд|разъезд|' as str FROM dual UNION
SELECT 'сад' as code, 'Сад|' as str FROM dual UNION
SELECT 'тракт' as code, 'тракт|' as str FROM dual UNION
SELECT 'тер' as code, 'тер|территория|' as str FROM dual UNION
SELECT 'ст' as code, 'ст|станция|' as str FROM dual UNION
SELECT 'сл' as code, 'сл|слобода|' as str FROM dual UNION
SELECT 'пр-кт' as code, 'пр.|проспект|пр-кт|' as str FROM dual UNION
SELECT 'шоссе' as code, 'ш.|шоссе|' as str FROM dual UNION
SELECT 'пл' as code, 'пл|площадь|пл-дь|' as str FROM dual UNION
SELECT 'наб.' as code, 'наб.|набережная|наб-ая|' as str FROM dual UNION
SELECT 'тупик' as code, 'тупик|' as str FROM dual UNION
SELECT 'мост' as code, 'мост|' as str FROM dual UNION
SELECT 'пр.' as code, 'проезд|пр.|' as str FROM dual UNION
SELECT 'блв.' as code, 'блв.|бульвар|' as str FROM dual UNION
SELECT 'мкн.' as code, 'мкн.|микрорайон|' as str FROM dual UNION
SELECT 'пос.' as code, 'пос.|посёлок|' as str FROM dual UNION
SELECT 'овраг' as code, 'овраг|' as str FROM dual UNION
SELECT 'городок' as code, 'городок|' as str FROM dual
)
SELECT t1.code, SUBSTR(t1.str, 
              DECODE(level, 1, 1, INSTR(t1.str, '|', 1, level - 1) + 1),
              INSTR(t1.str, '|', 1, level) 
              - DECODE(level, 1, 1, INSTR(t1.str, '|', 1, level - 1) + 1))
FROM test t1
CONNECT BY NVL(INSTR(t1.str, '|', 1, level), 0) <> 0
;


but it is necessary that it be as in the first case, for each code, respectively, broken lines, and not multiplication ...

PS: of course, DISTINCT helps, but for some reason it seems to me that this is not entirely correct ??

Answer the question

In order to leave comments, you need to log in

1 answer(s)
M
Maxim Y, 2021-06-11
@erge

Hello.
It is necessary to stop connect by in time.

SELECT
       t1.code
      ,regexp_substr(t1.str, '[^\|]+', 1, lvl) AS str_splitted
  FROM test t1
 CROSS JOIN LATERAL (
       SELECT level AS lvl
         FROM dual
      CONNECT BY level <= regexp_count(t1.str, '\|') 
 ) t2

Didn't find what you were looking for?

Ask your question

Ask a Question

731 491 924 answers to any question