
A column in my table holds values like 'ABCURLCBSURLDMSURLWER'. The text URL repeats within the value. I want to retrieve the text between each pair of consecutive URL occurrences, as below.

Column
------
CBS
DMS

I wrote the query below, but it does not return the result I want.

SELECT REGEXP_SUBSTR(
         (SELECT REPLACE(REPLACE('ABCURLCBSURLDMSURLWER', 'URL', ','), 'ABC', ',') AS AB FROM DUAL),
         '[^,]+', 1, LEVEL) AS AB
FROM   DUAL
CONNECT BY
       REGEXP_SUBSTR(
         (SELECT REPLACE(REPLACE('ABCURLCBSURLDMSURLWER', 'URL', ','), 'ABC', ',') AS AB FROM DUAL),
         '[^,]+', 1, LEVEL) IS NOT NULL;

AB
---
CBS
DMS
WER

How can I fix this query?

  • Just wondering, how come URLs are stored that way? Commented Sep 25, 2020 at 12:06
  • I had to use something repetitive to give an example, so I used it :)). Commented Sep 25, 2020 at 12:33

3 Answers


Try a query like this. Borrowing your approach of replacing the URL text with a comma makes the regex for splitting the string much simpler.

Updated Query

WITH some_data (ab) AS (SELECT 'ABCURLCBSURLDMSURLWER' FROM DUAL)
SELECT REGEXP_SUBSTR (REPLACE (sd.ab, 'URL', ','),
                      '[^,]+',
                      1,
                      lines.COLUMN_VALUE)    AS ab
  FROM some_data  sd,
       TABLE (CAST (MULTISET (    SELECT LEVEL     AS level_num
                                    FROM DUAL
                              CONNECT BY INSTR (sd.ab,
                                                'URL',
                                                1,
                                                LEVEL) > 0) AS SYS.odciNumberList)) lines
 WHERE lines.COLUMN_VALUE > 1;

Output

    AB
______
CBS
DMS

6 Comments

Unfortunately, the result is wrong. The query also returned ABC and WE.
@EJEgyed, the OP wants values between URLs only, i.e. CBS and DMS
This is wrong (apart from not doing what the question asks) because [^(URL)] matches any one character that is not an opening round bracket ( or U or R or L or a closing round bracket ). So if you had the input as 'URL(LUR)(LUR)URL' then you would get no characters matched.
I have fixed the query so it should return proper results now.
If the string is 'ABCURLCB,SURLDMSURLWER' then this only returns CB and S rather than CB,S and DMS.
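To illustrate the point made in the comments about the pattern [^(URL)], here is a minimal sketch on literals taken from this thread. The alias piece is only illustrative. A bracket expression excludes individual characters, so it cannot treat URL as a unit.

```sql
-- '[^(URL)]+' matches runs of characters that are none of '(', 'U', 'R', 'L', ')'.
SELECT REGEXP_SUBSTR('ABCURLCBSURLDMSURLWER', '[^(URL)]+', 1, LEVEL) AS piece
FROM   DUAL
CONNECT BY REGEXP_SUBSTR('ABCURLCBSURLDMSURLWER', '[^(URL)]+', 1, LEVEL) IS NOT NULL;
-- pieces: ABC, CBS, DMS, WE (the trailing R of WER is excluded too)

-- Every character of this input is in the excluded set, so nothing matches.
SELECT REGEXP_SUBSTR('URL(LUR)(LUR)URL', '[^(URL)]+') AS piece
FROM   DUAL;
-- piece is NULL
```

This is why the earlier result contained ABC and WE in addition to CBS and DMS.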

Yet another option; see comments within code:

with test (col) as
  -- sample data
  (select 'ABCURLCBSURLDMSURLWER' from dual),
rpl as
  -- replace URL with a semi-colon (a single/simple delimiter)
  (select replace(col, 'URL', ';') col
   from test
  ),
rmv as
  -- remove everything in front of the 1st delimiter and everything after the last delimiter
  (select substr(col, instr(col, ';') + 1,
                      instr(col, ';', -1, 1) - instr(col, ';') - 1) val
   from rpl
  )
select regexp_substr(val, '[^;]+', 1, level) result
from rmv
connect by level <= regexp_count(val, ';') + 1;

RESULT
--------------------
CBS
DMS

2 Comments

If the string is 'ABCURLCB;SURLDMSURLWER' then this returns 3 rows instead of 2.
Certainly, @MT0, but the sample data says nothing about containing semi-colons within the string. On the other hand, it says nothing about NOT containing them either :)
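One possible way around the delimiter collision discussed in these comments is to swap the semi-colon for a character that is unlikely to occur in the data. The sketch below uses CHR(1). That choice is purely an assumption about the data, not something the question states. It is a variation on the query above, not the answerer's own code.

```sql
WITH test (col) AS
  -- sample value containing a semi-colon inside one of the pieces
  (SELECT 'ABCURLCB;SURLDMSURLWER' FROM dual),
rpl AS
  -- replace URL with CHR(1), assumed never to appear in the data
  (SELECT REPLACE(col, 'URL', CHR(1)) AS col
   FROM test
  ),
rmv AS
  -- keep only the text between the first and the last delimiter
  (SELECT SUBSTR(col, INSTR(col, CHR(1)) + 1,
                      INSTR(col, CHR(1), -1, 1) - INSTR(col, CHR(1)) - 1) AS val
   FROM rpl
  )
SELECT REGEXP_SUBSTR(val, '[^' || CHR(1) || ']+', 1, LEVEL) AS result
FROM rmv
CONNECT BY LEVEL <= REGEXP_COUNT(val, CHR(1)) + 1;
```

The intent is that CB;S stays in one piece. Like the original, this still skips empty pieces (two adjacent URL occurrences) because the pattern requires at least one character.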

You don't need regular expressions and can do this with simple string functions:

WITH bounds ( id, value, start_pos, end_pos ) AS (
  SELECT id,
         value,
         INSTR( value, 'URL', 1, 1 ) + 3,
         INSTR( value, 'URL', 1, 2 )
  FROM   table_name
UNION ALL
  SELECT id,
         value,
         end_pos + 3,
         INSTR( value, 'URL', end_pos + 3, 1 )
  FROM   bounds
  WHERE  end_pos > 0
)
SELECT id,
       start_pos,
       SUBSTR( value, start_pos, end_pos - start_pos ) AS url
FROM   bounds
WHERE  end_pos > 0
ORDER BY id, start_pos;

So, for the sample data:

CREATE TABLE table_name ( id, value ) AS
SELECT 1, 'ABCURLCBSURLDMSURLWER' FROM DUAL UNION ALL
SELECT 2, 'ABCURLURLDEFURLGHIURL' FROM DUAL;

This outputs:

ID | START_POS | URL 
-: | --------: | :---
 1 |         7 | CBS 
 1 |        13 | DMS 
 2 |         7 | null
 2 |        10 | DEF 
 2 |        16 | GHI 
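To make the recursion concrete, the INSTR calls it performs for the row with id 1 can be checked directly. This is a small sketch on the literal from the question; the column aliases are only illustrative.

```sql
SELECT INSTR('ABCURLCBSURLDMSURLWER', 'URL', 1, 1)  AS first_url,   -- 4:  anchor start_pos = 4 + 3 = 7
       INSTR('ABCURLCBSURLDMSURLWER', 'URL', 1, 2)  AS second_url,  -- 10: anchor end_pos, piece = SUBSTR(value, 7, 3)
       INSTR('ABCURLCBSURLDMSURLWER', 'URL', 13, 1) AS third_url,   -- 16: next end_pos, piece = SUBSTR(value, 13, 3)
       INSTR('ABCURLCBSURLDMSURLWER', 'URL', 19, 1) AS no_more      -- 0:  no further URL, so the recursion stops
FROM   DUAL;
```

The last step produces a row with end_pos = 0, which the final WHERE end_pos > 0 filters out. That is why the trailing WER never appears in the output.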



Option 2

If you did want to use regular expressions then you can use:

SELECT t.id,
       x.COLUMN_VALUE AS url
FROM   table_name t
       CROSS APPLY TABLE(
         CAST(
           MULTISET(
             SELECT REGEXP_SUBSTR(
                      t.value,
                      '(.*?)URL',
                      INSTR( t.value, 'URL' ) + 3,
                      LEVEL,
                      NULL,
                      1
                    )
              FROM  DUAL
              CONNECT BY
                     LEVEL <= REGEXP_COUNT(
                                t.value, '(.*?)URL', INSTR( t.value, 'URL' ) + 3
                              )
           )
           AS SYS.ODCIVARCHAR2LIST
         )
       ) x;

Which, for the same test data, outputs:

ID | URL 
-: | :---
 1 | CBS 
 1 | DMS 
 2 | null
 2 | DEF 
 2 | GHI 
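The REGEXP_SUBSTR arguments used in Option 2 can be tried in isolation. The sketch below hard-codes the start position 7, which is INSTR(value, 'URL') + 3 for the first sample row; the aliases are only illustrative.

```sql
SELECT REGEXP_SUBSTR('ABCURLCBSURLDMSURLWER', '(.*?)URL', 7, 1, NULL, 1) AS first_piece,   -- CBS
       REGEXP_SUBSTR('ABCURLCBSURLDMSURLWER', '(.*?)URL', 7, 2, NULL, 1) AS second_piece,  -- DMS
       REGEXP_COUNT('ABCURLCBSURLDMSURLWER', '(.*?)URL', 7)              AS pieces         -- 2
FROM   DUAL;
```

The fourth argument selects the occurrence and the sixth argument returns only the first capture group, so the terminating URL is dropped from each piece. The trailing WER is not followed by URL, so it is never matched.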


2 Comments

Option 2 would not work if your test_data table had more than one row in it as you would get duplicated data.
@EJEgyed The OP only had a single row of input. However, that's an easy fix so I've updated it to handle multiple rows. Just take the query and CROSS APPLY/CROSS JOIN it to the multi-row input table inside a table collection expression.
