I need to replace multiple words such as (dog|cat|bird) with nothing in a string where there may be multiple consecutive occurrences of a word. The actual code is to remove salutations and suffixes from a name. Unfortunately the garbage data I get sometimes contains "SNERD JR JR."

I was able to create a regular expression pattern that accomplishes my goal but only for the first occurrence. I implemented a stupid hack to get rid of the second occurrence, but I believe there has to be a better way. I just can't figure it out.

Here is my "hacked" code:

  FUNCTION REMOVE_SALUTATIONS(IN_STRING VARCHAR2) RETURN VARCHAR2 DETERMINISTIC
  AS
    REGEX_SALUTATIONS VARCHAR2(4000) := '(^|\s)(MR|MS|MISS|MRS|DR|MD|M D|SR|SIR|PHD|P H D|II|III|IV|JR)(\.?)(\s|$)';
  BEGIN
    RETURN TRIM(REGEXP_REPLACE(REGEXP_REPLACE(IN_STRING,REGEX_SALUTATIONS,' '),REGEX_SALUTATIONS,''));
  END REMOVE_SALUTATIONS;

I was actually proud that I was able to get this far, as regular expressions are not very regular to me. All help is appreciated.

EDIT:

Based on my understanding, REGEXP_REPLACE does a global replace by default. But on the off chance that my DB is configured differently, I did try:

select REGEXP_REPLACE('SNERD JR JR','(^|\s)(MR|MS|MISS|MRS|DR|MD|M D|SR|SIR|PHD|P H D|II|III|IV|JR)(\.?)(\s|$)',' ',1,0) from dual;

and the result is:

SNERD JR
  • Your regexp consumes too much: it needs whitespace on both sides of the first JR, so the space after it is used up and is no longer available to match before the second JR (demo). Could you check whether lookahead works in Oracle (demo with lookahead)? Commented Feb 26, 2018 at 16:10
  • Yes, I see the problem is as you stated. Oracle does not handle lookaheads as far as I can tell. I tried your example and also researched it. The example does not work, and what I've read says Oracle does not do lookaheads. Commented Feb 26, 2018 at 21:02
  • I tried to do it without lookahead: (^|\b|\s)(MR|MS|MISS|MRS|DR|MD|M D|SR|SIR|PHD|P H D|II|III|IV|JR)(\.?)(\s|$). Maybe this will help (demo). Commented Feb 26, 2018 at 23:03
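
The point made in the first comment can be checked directly on the database. The queries below are only a sketch, assuming Oracle 11g or later (REGEXP_COUNT does not exist in earlier releases). The second input, with two spaces between the suffixes, is made up purely for illustration and is not from the question.

```sql
-- Each match consumes its trailing whitespace, so the single space after the
-- first JR cannot also serve as the leading (^|\s) of the second JR.
SELECT REGEXP_COUNT(
         'SNERD JR JR',
         '(^|\s)(MR|MS|MISS|MRS|DR|MD|M D|SR|SIR|PHD|P H D|II|III|IV|JR)(\.?)(\s|$)'
       ) AS matches
FROM dual;   -- expected: 1 (only the first JR is matched)

-- Illustrative input with TWO spaces between the suffixes: now each JR has
-- its own leading whitespace, so both occurrences can match.
SELECT REGEXP_COUNT(
         'SNERD JR  JR',
         '(^|\s)(MR|MS|MISS|MRS|DR|MD|M D|SR|SIR|PHD|P H D|II|III|IV|JR)(\.?)(\s|$)'
       ) AS matches
FROM dual;   -- expected: 2
```

This matches the 'SNERD JR' result shown in the question's edit: the replace is already global, but the second JR is never a match in the first place.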

1 Answer

Use the occurrence parameter of the REGEXP_REPLACE function. The docs say:

occurrence is a nonnegative integer indicating the occurrence of the replace operation:

  • If you specify 0, then Oracle replaces all occurrences of the match.
  • If you specify a positive integer n, then Oracle replaces the nth occurrence.

https://docs.oracle.com/cd/B28359_01/server.111/b28286/functions137.htm#SQLRF06302

It should look like:

...
REGEXP_REPLACE(IN_STRING,REGEX_SALUTATIONS,' ', 1,0 )
...
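
If occurrence 0 still leaves a repeated suffix behind, as the question's edit reports on 11g, one possible workaround that needs no lookahead is to repeat the replacement until the string stops changing. The following is a minimal sketch, not a tested drop-in: it assumes a standalone function (the original appears to live inside a package), and the local names L_PREV and L_RESULT are hypothetical.

```sql
CREATE OR REPLACE FUNCTION REMOVE_SALUTATIONS(IN_STRING VARCHAR2)
  RETURN VARCHAR2 DETERMINISTIC
AS
  -- Same pattern as in the question.
  REGEX_SALUTATIONS CONSTANT VARCHAR2(4000) :=
    '(^|\s)(MR|MS|MISS|MRS|DR|MD|M D|SR|SIR|PHD|P H D|II|III|IV|JR)(\.?)(\s|$)';
  L_PREV   VARCHAR2(32767);               -- hypothetical helper variable
  L_RESULT VARCHAR2(32767) := IN_STRING;  -- hypothetical helper variable
BEGIN
  LOOP
    L_PREV   := L_RESULT;
    -- Every match is at least two characters long and is replaced by a single
    -- space, so the string shrinks on each changing pass and the loop ends.
    L_RESULT := REGEXP_REPLACE(L_PREV, REGEX_SALUTATIONS, ' ');
    -- The NULL check comes first because NULL = NULL is never true in Oracle.
    EXIT WHEN L_RESULT IS NULL OR L_RESULT = L_PREV;
  END LOOP;
  RETURN TRIM(L_RESULT);
END REMOVE_SALUTATIONS;
/

SELECT REMOVE_SALUTATIONS('SNERD JR JR') FROM dual;  -- expected: SNERD
```

The loop generalizes the nested double call from the question to any number of consecutive salutations or suffixes, at the cost of one extra pass that finds nothing to replace.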

6 Comments

mrzasa, thanks for your help, I had tried that earlier and did not mention it in my post. BTW the 4th parameter must be > 0. See my edited post above.
Just answered; see the comments below the question.
mrzasa, Using a 0 in the 5th parameter does not work on my DB (11g). I posted an example in an "EDIT" in the original question that you can easily test.
I could easily test it if I had Oracle 11g installed :). As I don't have access to it, I can only advise you based on the docs and my regexp knowledge. Or I can stop doing so.
mrzasa I appreciate your help. Unfortunately, even though you told me where to look, I did not see your post about look ahead. My apologies.