1

I need to check if a partial name matches full name. For example:

Partial_Name   |  Full_Name
--------------------------------------
John,Smith     |  Smith William John
Eglid,Timothy  |  Timothy M Eglid

I have no clue how to approach this type of matching.

Another thing is that name and last name may come in the wrong order, making it harder.

I could do something like this, but this only works if names are in the same order and 100% match

decode(LOWER(REGEXP_REPLACE(Partial_Name,'[^a-zA-Z'']','')), LOWER(REGEXP_REPLACE(Full_Name,'[^a-zA-Z'']','')), 'Same', 'Different')

1
  • I would create a set of tokens from the full_name, and then see if one or more patterns in partial_name (separated by comma) exist in that token set. Commented Nov 21, 2014 at 22:28

3 Answers 3

2

you could use this pattern on the text provided - works for most engines

([^ ,]+),([^ ,]+)(?=.*\b\1\b)(?=.*\b\2\b)  

Demo

Sign up to request clarification or add additional context in comments.

3 Comments

Some test cases that users of this may or may not be interested in: For Full_Name "Smith William John", "Smith" is not a match, "Smith,Will" is not a match, "Smith, William" is not a match.
Will is suppressed by the use of \b to prevent unwanted name matching - unless OK with OP, and Smith, William has an extra space which is not "the text provided", my pattern is as good as the example provided !
Sure. This may be exactly what the asker is looking for. I was just "stress testing" it for harder or smarter cases.
1
WITH 
/*
tab AS
(
SELECT 'Smith William John' Full_Name, 'John,Smith' Partial_Name FROM dual
UNION ALL SELECT 'Timothy M Eglid', 'Eglid,timothy' FROM dual
UNION ALL SELECT 'Tim M Egli', 'Egli,Tim,M2' FROM dual
UNION ALL SELECT 'Timot M Eg', 'Eg' FROM dual
),
*/
tmp AS (
  SELECT Full_Name, Partial_Name,
         trim(CASE WHEN instr(Partial_Name, ',') = 0 THEN Partial_Name         
                   ELSE regexp_substr(Partial_Name, '[^,]+', 1, lvl+1)
              END) token
  FROM tab t CROSS JOIN (SELECT lvl FROM (SELECT LEVEL-1 lvl FROM dual 
                         CONNECT BY LEVEL <= (SELECT MAX(LENGTH(Partial_Name) - LENGTH(REPLACE(Partial_Name, ',')))+1 FROM tab)))
  WHERE LENGTH(Partial_Name) - LENGTH(REPLACE(Partial_Name, ',')) >= lvl
) 
SELECT Full_Name, Partial_Name
FROM tmp
GROUP BY Full_Name, Partial_Name
HAVING count(DISTINCT token) 
       = count(DISTINCT CASE WHEN REGEXP_LIKE(Full_Name, token, 'i') 
                             THEN token ELSE NULL END);                             

In the tmp each partial_name is splitted on tokens (separated by comma)

The resulting query retrieves only those rows which full_name matches all the corresponding tokens.

This query works with the dynamic number of commas in partial_name. If there can be only zero or one commas then the query will be much easier:

SELECT * FROM tab 
WHERE instr(Partial_Name, ',') > 0 
         AND REGEXP_LIKE(full_name, substr(Partial_Name, 1, instr(Partial_Name, ',')-1), 'ix')
         AND REGEXP_LIKE(full_name, substr(Partial_Name,instr(Partial_Name, ',')+1), 'ix')
      OR instr(Partial_Name, ',') = 0 
         AND REGEXP_LIKE(full_name, Partial_Name, 'ix');

Comments

0

This is what I ended up doing... Not sure if this is the best approach. I split partials by comma and check if first name present in full name and last name present in full name. If both are present then match.

CASE 
    WHEN 
     instr(trim(lower(Full_Name)), 
     trim(lower(REGEXP_SUBSTR(Partial_Name, '[^,]+', 1, 1)))) > 0
       AND 
     instr(trim(lower(Full_Name)), 
     trim(lower(REGEXP_SUBSTR(Partial_Name, '[^,]+', 1, 2)))) > 0
    THEN 'Y'
    ELSE 'N'
END AS MATCHING_NAMES

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.