
I'm trying to write a regex that parses a full name and splits it into first name, middle name, and last name. This should be easy, but it's pretty hard once you see the kinds of names I have to parse. I could write one big, long regex that takes all these different cases into account, but I think a smaller, dynamic regex is possible, and that's why I'm here asking for help.

I think these are all of the types of names I have to handle.

Some example names that need to be parsed (each has three commas at the end):

(first name) (middle initial). (last name),,,                    //one middle initial with a period after it
(first name) (last name),,,                                      //simple first and last
(No name),,,                                                     //no name
(first name) (last name)-(last name),,,                          //two last names separated by a dash
(first name) (middle initial). (middle initial). (last name),,,  //two middle initials with a space in between
(first name) (last name w/ apostrophe),,,                        //last names with apostrophes
(first name) (Middle name) (Last name),,,                        //first, middle, and last name
  • I already used split to break each record into fields; now I am just trying to split the name field itself. Commented Feb 24, 2012 at 16:09
  • They're all separated by spaces. Just use /(\S+ )?(\S+ )?(\S+ )?(\S+)?,,,/ Commented Feb 24, 2012 at 16:19
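
To see what the regex suggested in the comment above actually captures, here is a small sketch. The helper name capture_words and the '(undef)' placeholder are made up for illustration; only the pattern itself comes from the comment.

```perl
use strict;
use warnings;

# The pattern from the comment: up to three space-terminated words,
# then an optional final word, followed by the three commas.
my $re = qr/(\S+ )?(\S+ )?(\S+ )?(\S+)?,,,/;

# Hypothetical helper: return the four capture groups for one line,
# with '(undef)' standing in for groups that did not take part in the match.
sub capture_words {
    my ($line) = @_;
    my @groups = $line =~ $re;
    return unless @groups;
    return map { defined $_ ? $_ : '(undef)' } @groups;
}

for my $line ('Foo B. Baz,,,', 'Fnord Quux,,,', 'Abe C. D. Efg,,,') {
    my @groups = capture_words($line);
    print join(' | ', map { "[$_]" } @groups), "\n";
}
```

Note that the captured words keep their trailing space, and that the slot holding a middle name or initial depends on how many words the name has, so the groups still need to be interpreted afterwards.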

3 Answers

4

You can't parse something that ultimately follows no rules and hope to have any success. The problem is not translating the algorithm to a regular expression, but writing the algorithm to begin with.

Consider: how would you write an algorithm that could properly parse all these names into Given, Middle, and Family names?

  • Bob Mac Intosh
  • Mary Jane Watson
  • Thurston Powell III
  • Michael van der Velden
  • Jacqueline Kennedy Onassis
  • Dr. Jean Grey
  • Takahashi Shiro
  • Michel La Fontaine
  • Sir Alec Guinness
  • Mary-Sue Bowes-Lyon
  • Sacha Baron Cohen
  • Jack Arnold Jr.

See what I mean? You'd need an AI to be able to properly chunk each of these words into the proper context. Some people use two names as their "given" name. Some people use titles or honorifics, and some cultures place their family name first and given name last.
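
As a rough illustration of the problem, here is a sketch of the most obvious rule (first word is the given name, last word is the family name, everything in between is the middle name) applied to a few of the names above. The function naive_split is hypothetical and exists only to show where such a rule goes wrong.

```perl
use strict;
use warnings;

# Hypothetical naive rule: first word = given name, last word = family name,
# anything in between = middle name.
sub naive_split {
    my ($full) = @_;
    my @words = split ' ', $full;    # split on runs of whitespace
    my $first = shift @words;
    my $last  = pop @words;          # undef if the name has only one word
    return ($first, join(' ', @words), $last);
}

for my $name ('Thurston Powell III', 'Michael van der Velden',
              'Dr. Jean Grey', 'Takahashi Shiro') {
    my ($first, $middle, $last) = naive_split($name);
    printf "%-24s first=[%s] middle=[%s] last=[%s]\n",
        $name, $first, $middle, $last // '';
}
```

Under this rule "III" becomes a last name, "Dr." becomes a first name, and "van der" becomes a middle name, and nothing in the string itself says whether "Takahashi Shiro" is written given name first or family name first.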

Summary: Don't do it. If you cannot get the user to separate their name into specific chunks for you, you must treat them as atoms.


3

No code, but try:

  1. use substr to remove the last three characters (the ",,,") from $name,
  2. @array = split /[\s.]+/, $name; # split on whitespace and/or dots (as mentioned above) into an array,
  3. if ($array[0]) then you have a name,
  4. $lastname = pop @array; # gets the last (or only) name
  5. $firstname = shift @array if scalar @array; # first name is the first remaining element
  6. @array now contains all middle names and/or initials

Something like that, anyway...
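
The steps above could be sketched roughly as follows. This is a minimal illustration under a few assumptions, not a tested solution: split_name and the shape of the returned hash are made up here, every input is assumed to end in exactly ",,,", and because the split also consumes dots, an initial such as "B." comes back as "B".

```perl
use strict;
use warnings;

# Hypothetical helper: split one "Name,,," field into first, middle, and last.
# Returns undef when no name is present.
sub split_name {
    my ($name) = @_;
    $name = substr($name, 0, -3);              # step 1: drop the trailing ",,,"
    my @parts = split /[\s.]+/, $name;         # step 2: split on whitespace and/or dots
    shift @parts while @parts && $parts[0] eq '';  # discard an empty leading field, if any
    return undef unless @parts;                # step 3: nothing left means no name
    my $last  = pop @parts;                    # step 4: last (or only) name
    my $first = @parts ? shift @parts : undef; # step 5: first name, if one remains
    return { first => $first, middle => [@parts], last => $last };  # step 6
}

for my $field ('Foo B. Baz,,,', 'Fnord Quux,,,', ' ,,,', 'Abe C. D. Efg,,,') {
    my $n = split_name($field);
    if (!$n) {
        print "no name\n";
        next;
    }
    printf "first=[%s] middle=[%s] last=[%s]\n",
        $n->{first} // '', join(' ', @{ $n->{middle} }), $n->{last};
}
```

Returning the middle parts as an array keeps multiple initials separate; join them with a space if a single string is needed.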


3
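use 5.010;
use strict;
use warnings;
use DDS;    # short alias for Data::Dump::Streamer

while (<DATA>) {
    chomp;
    s/,,,.*//;              # drop the trailing commas and anything after them
    if (' ' eq $_) {        # a lone space means no name was supplied
        say 'no name';
    } elsif (/\A (?<first>\S+) \s+ (?<middle>.*?)? (?:\s+)? (?<last>\S+) \z/msx) {
        DumpLex \%+;        # dump the named captures: first, middle, last
    } else {
        say "unparsed: $_"; # e.g. a single-word name, which the pattern cannot match
    }
}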

__DATA__
Foo B. Baz,,,
Fnord Quux,,,
 ,,,
Xyzzy Bling-Bling,,,
Abe C. D. Efg,,,
Ed O'postrophe,,,
First Middle Last,,,

Output:

$HASH1 = {
           first  => 'Foo',
           last   => 'Baz',
           middle => 'B.'
         };
$HASH1 = {
           first  => 'Fnord',
           last   => 'Quux',
           middle => ''
         };
no name
$HASH1 = {
           first  => 'Xyzzy',
           last   => 'Bling-Bling',
           middle => ''
         };
$HASH1 = {
           first  => 'Abe',
           last   => 'Efg',
           middle => 'C. D.'
         };
$HASH1 = {
           first  => 'Ed',
           last   => 'O\'postrophe',
           middle => ''
         };
$HASH1 = {
           first  => 'First',
           last   => 'Last',
           middle => 'Middle'
         };

1 Comment

I just noticed, and this is my fault, but when no name is supplied the field doesn't literally say "no name"; it's just a space and three commas: " ,,,"
