Oracle PL/SQL regexp_replace for multiple words

Question

I have the string 'TICKER: IBM IBM Corporation Inc.' and I want to remove the ticker label and its value, keeping only the rest of the string, in Oracle PL/SQL.

So I made this query but it is not working the way I intended:

SELECT REGEXP_REPLACE(
           'TICKER: IBM IBM Corporation Inc.',
           '(.*):[[:space:]](.*)[[:space:]](.*)', '\3')
      FROM dual;

I was hoping that '\3' would yield 'IBM Corporation Inc.', but I get just 'Inc.' as the result.

REGEXP_REPLACE('TICKER:IBMIBMCORPORATIONINC.','(.*):[[:SPACE:]](.*)[[:SPACE:]](.*)','\3') 
----------------------------------------------------------------------------- 
Inc.                                                                                      

1 rows selected

Update:

SELECT REGEXP_REPLACE(
       'TICKER: IBM IBM Corporation Inc.',
       '(.*):[[:space:]](.*)[[:space:]](.*)', '\1|\2|\3')
  FROM dual;

Result:

REGEXP_REPLACE('TICKER:IBMIBMCORPORATIONINC.','(.*):[[:SPACE:]](.*)[[:SPACE:]](.*)','\1|\2|\3') 
-------------------------------------------------------------------------------- 
TICKER|IBM IBM Corporation|Inc.

What am I missing in the regular expression?

Thanks.

Have you tried '\2'? See also: docs.oracle.com/cd/B19306_01/server.102/b14200/…
– paulsm4, Commented May 5, 2016 at 18:33

earachefl · Accepted Answer · 2016-05-05 18:41:48Z

2

SELECT REGEXP_REPLACE(
       'TICKER: IBM IBM Corporation Inc.',
       '(.*):[[:space:]]([^ ]*)[[:space:]](.*)', '\3')
  FROM dual;

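Your second capturing expression, (.*), is greedy: it was grabbing everything it could, including the next space, so it matched 'IBM IBM Corporation' and left only 'Inc.' for the third group. Replacing it with [^ ]* stops it at the first space.

I should mention that I tested this in plain Oracle SQL, not inside PL/SQL. I would think there'd be no difference, though.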
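
As a quick check, the debugging trick from the question's update (printing all three groups separated by '|') can be reused with the corrected pattern. This is only a sketch against the sample string from the question; the alias parts is made up for illustration.

```sql
-- Show what each capture group matches once the second group
-- is limited to non-space characters.
SELECT REGEXP_REPLACE(
       'TICKER: IBM IBM Corporation Inc.',
       '(.*):[[:space:]]([^ ]*)[[:space:]](.*)', '\1|\2|\3') AS parts
  FROM dual;
```

With the sample string this should give TICKER|IBM|IBM Corporation Inc., compared with TICKER|IBM IBM Corporation|Inc. for the original pattern.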

PS: the following alternates work as well:

-- using only one capturing expression
SELECT REGEXP_REPLACE(
       'TICKER: IBM IBM Corporation Inc.',
       '.*: [^ ]* (.*)', '\1')
  FROM dual;

-- using no capturing expressions
SELECT REGEXP_REPLACE(
       'TICKER: IBM IBM Corporation Inc.',
       '.*: [^ ]* ', '')
  FROM dual;

edited May 5, 2016 at 18:41

answered May 5, 2016 at 18:36


2 Comments

Gary_W Over a year ago

I was thinking along the same lines of replacing the first 2 words with NULL, assuming they are always there and the value (symbol?) will always be 1 word: '\w+: \w+ '

Gary_W Over a year ago

Should probably tighten it up a little by anchoring to the start of the string: '^\w+: \w+ '.
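
A minimal sketch of the anchored pattern suggested in these comments, assuming the label and the symbol are each a single run of word characters and that your Oracle version supports the Perl-style \w shorthand. The alias company_name is hypothetical.

```sql
-- Strip the leading "LABEL: SYMBOL " prefix. The '^' anchors the match
-- to the start of the string, so nothing later in the string is removed.
SELECT REGEXP_REPLACE(
       'TICKER: IBM IBM Corporation Inc.',
       '^\w+: \w+ ', '') AS company_name
  FROM dual;
```

For the sample string this should leave IBM Corporation Inc. A symbol containing a character outside \w, such as a dot, would not match, and the string would come back unchanged.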

MT0 · Accepted Answer · 2016-05-05 20:21:20Z

1

SELECT REGEXP_REPLACE(
           'TICKER: IBM IBM Corporation Inc.',
           '^(.*?):\s(\S*)\s(.*)$',
           '\3'
       )
FROM DUAL;

Or your original code needs only small changes to work (anchoring it to the start of the string and making the first two wildcard matches non-greedy):

SELECT REGEXP_REPLACE(
           'TICKER: IBM IBM: Corporation Inc.',
           '^(.*?):[[:space:]](.*?)[[:space:]](.*)',
           '\3'
        )
FROM DUAL;
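
To make the difference visible, here is a small side-by-side sketch using the colon-containing sample string from the second query above. The column aliases are made up for illustration.

```sql
-- Compare a greedy and a non-greedy first group on a string whose
-- company name contains a colon.
SELECT REGEXP_REPLACE(
           'TICKER: IBM IBM: Corporation Inc.',
           '(.*):[[:space:]]([^ ]*)[[:space:]](.*)',
           '\3'
       ) AS greedy_result,
       REGEXP_REPLACE(
           'TICKER: IBM IBM: Corporation Inc.',
           '^(.*?):\s(\S*)\s(.*)$',
           '\3'
       ) AS non_greedy_result
FROM DUAL;
```

The greedy version should return only Inc., because its first group runs on to the last colon that is followed by a space, while the non-greedy version should return IBM: Corporation Inc.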

edited May 5, 2016 at 20:21

answered May 5, 2016 at 19:00


5 Comments

Gary_W Over a year ago

Unlikely, but one never knows. Try it with a company name containing a colon: 'TICKER: IBM :IBM Co:rporation: Inc.'. Goes to show a query should be run to check for colons in the data first, I guess.

JKK Over a year ago

Good point on description containing a colon. Your query seems to be handling it perfectly. Thanks. +1

Gary_W Over a year ago

@JKK Always expect the unexpected! Depending on the source of the data and how well it is (or most likely isn't) validated, all kinds of crud can be accepted and end up in the database. Always do some sanity checking against the data before making assumptions like "the company names will never contain a colon" :-)

JKK Over a year ago

Yep, agreed. In my scenario, this data is always well maintained (because it is produced by another layer) and free from any non-alpha or special characters. Having said that, I'm going with the suggested approach (just in case). Thanks.

MT0 Over a year ago

@JKK fixed the : issue and also added a simple fix for your original query.
