
I have a |-separated string with 20 |s, like 123|1|42|13||94123|2983191|2|98863|...|211|, up to 20 |s. It is stored in an Oracle DB column. The string is just 20 numbers, each followed by a |.

I am trying to derive a string from it in which the numbers at positions 4, 6, 8, 9, 11, 12 and 13 are removed. I also need to move the number at position 16 to position 4. So far, I have a regex like

select regexp_replace(col1, '^((\d*\|){4})(\d*\|)(\d*\|)(\d*\|)(\d*\|)((\d*\|){2})(\d*\|)((\d*\|){3})((\d*\|){2})(\d*\|)(.*)$', '\1|\4|\6||\9||||||||') as cc from table

This is where I get stuck, since Oracle only supports backreferences for up to 9 groups. Is there any way to simplify this regex so that it has fewer groups and fits into the replace? Any alternative solutions/suggestions are also welcome.

Note: the position counter begins at 0, so 123 in the above string is the 0th number.

Edit: Example -

Source string

|||14444|10107|227931|10115||10118||11361|11485||10110||11512|16666|||

Expected result

|||16666|10107||10115||||||11512||||
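Since the positions are easy to miscount by eye, a short script can list them. This is a minimal Python sketch used only for counting fields (it is not part of the Oracle solution, and the helper name `positions` is made up for illustration). For the two strings above it reports 16666 at zero-based position 16 and 10107 at position 4 in the source, and 16666 at position 3 in the expected result.

```python
# Sample strings copied from the question.
source = '|||14444|10107|227931|10115||10118||11361|11485||10110||11512|16666|||'
expected = '|||16666|10107||10115||||||11512||||'

def positions(s):
    """Map zero-based position -> value for the non-empty fields of s."""
    return {i: v for i, v in enumerate(s.split('|')) if v}

print(positions(source))
print(positions(expected))
```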

  • Is there any other column in that table that uniquely identifies every row? Commented Nov 4, 2019 at 7:41
  • @Littlefoot - yes, there is an id column on this table Commented Nov 4, 2019 at 7:43
  • Your expected result doesn't seem to agree with the first position being 0, because in that case 10107 is actually the number at position 4 that should be replaced with 16666. Commented Nov 4, 2019 at 9:36

2 Answers


You can get the result you want by removing capture groups for the numbers you are removing from the string anyway, and writing (for example) ((\d*\|){2}) as (\d*\|\d*\|). This reduces the number of capture groups to 7, allowing your code to work as is:

select regexp_replace(col1, 
     '^(\d*\|\d*\|\d*\|\d*\|)\d*\|(\d*\|)\d*\|(\d*\|)\d*\|\d*\|(\d*\|)\d*\|\d*\|\d*\|(\d*\|\d*\|)(\d*\|)(.*)$', 
     '\1\6\2|\3||\4|||\5|\7') as cc 
from table

Output (for your test data and also for @Littlefoot's good column example):

CC
|||14444|16666|227931|||||11361|||||11512|||||
0|1|2|3|16|5||7|||10||||14|15||17|18|19|

Demo on dbfiddle
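To check which field each capture group picks up without a database at hand, the same pattern and replacement can be run through another regex engine. This is a minimal Python sketch, assuming Python's `re` module treats this particular pattern the same way Oracle's `regexp_replace` does; the test row (field i holds the number i) is made up for illustration.

```python
import re

# Pattern and replacement copied from the query above.
PATTERN = (r'^(\d*\|\d*\|\d*\|\d*\|)\d*\|(\d*\|)\d*\|(\d*\|)\d*\|\d*\|'
           r'(\d*\|)\d*\|\d*\|\d*\|(\d*\|\d*\|)(\d*\|)(.*)$')
REPLACEMENT = r'\1\6\2|\3||\4|||\5|\7'

# Hypothetical test row: 20 fields, each followed by '|', field i holds i.
row = ''.join(f'{i}|' for i in range(20))

# Positions 4, 6, 8, 9, 11, 12, 13 are dropped and field 16 lands at position 4.
result = re.sub(PATTERN, REPLACEMENT, row)
print(result)  # 0|1|2|3|16|5||7|||10||||14|15||17|18|19|
```

Only seven groups are referenced in the replacement, so it stays within the \1 to \9 limit.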


1 Comment

This is exactly what I was after. Thanks

As there's a unique column (ID, as you said), see if this helps:

  • split every column into rows
  • compose them back (using listagg) which uses 2 CASEs:
    • one to remove values you don't want
    • another to properly sort them ("put value at position 16 to position 4")
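The idea behind these steps can also be sketched outside SQL. The following Python snippet only illustrates split, blank, move and join; it is not a translation of the listagg query in this answer. It assumes the zero-based positions stated in the question, and the name `rearrange` is hypothetical.

```python
REMOVE = {4, 6, 8, 9, 11, 12, 13}   # zero-based positions to blank out
MOVE_FROM, MOVE_TO = 16, 4          # move field 16 into position 4

def rearrange(s):
    """Blank the unwanted fields and move one field, keeping every '|'."""
    fields = s.split('|')           # a trailing '|' yields a final empty element
    out = list(fields)
    for i in REMOVE:
        out[i] = ''
    out[MOVE_TO] = fields[MOVE_FROM]
    out[MOVE_FROM] = ''             # the moved field leaves an empty slot behind
    return '|'.join(out)

# Hypothetical test row: field i holds the number i.
row = ''.join(f'{i}|' for i in range(20))
print(rearrange(row))  # 0|1|2|3|16|5||7|||10||||14|15||17|18|19|
```

The sketch raises an IndexError for strings with fewer than 17 fields, so real data would need a length check first.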

Note that my result differs from yours; if I counted correctly, 16666 isn't at position 16 but at position 17, so 11512 is the value that has to be moved to position 4.

I also added another dummy row which is here to confirm whether I counted positions correctly, and to show why you have to use lines #10-12 (because of duplicates).

OK, here you are:

SQL> with test (id, col) as
  2    (
  3    select 1, '|||14444|10107|227931|10115||10118||11361|11485||10110||11512|16666|||' from dual union all
  4    select 2, '1|2|3|4|5|6|7|8|9|10|11|12|13|14|15|16|17|18|19|20'                     from dual
  5    ),
  6  temp as
  7    (select replace(regexp_substr(col, '(.*?)(\||$)', 1, column_value), '|', '') val,
  8            column_value lvl,
  9            id
 10     from test cross join table(cast(multiset(select level from dual
 11                                              connect by level <= regexp_count(col, '\|') + 1
 12                                             ) as sys.odcinumberlist))
 13    )
 14  select id,
 15    listagg(case when lvl in (4, 6, 8, 9, 11, 12, 13) then '|'
 16                 else val || case when lvl = 20 then '' else '|' end
 17            end, '')
 18            within group (order by case when lvl = 16 then 4
 19                                        when lvl =  4 then 16
 20                                        else lvl
 21                                   end) result
 22  from temp
 23  group by id;

        ID RESULT
---------- ------------------------------------------------------------
         1 |||11512|10107||10115|||||||10110|||16666|||
         2 1|2|3|16|5||7|||10||||14|15||17|18|19|20

SQL>

4 Comments

This isn't actually correct. There's an extra | between the values for columns 4 and 5, and no | for the missing column 16 value.
So true, @Nick; thank you. Fixed, with a little bit more typing & case involvement.
Appreciate the effort. Nick's answer is closer to what I am doing though and would be much faster in my case, so going with it. Cheers
No problem at all. Good luck!
