
I need to split values extracted from a DB, in which the records in one column are combined in this way: first letter(Firstname1). Lastname1, first letter(Firstname2). Lastname2, ...

Here is an example of how I tried to solve it:

    $text2 = "T. Toth, M. A. Carlo de Miller, T. Stallone";
    $keywords = preg_split("/,/", $text2);

    print_r($keywords);

    // I got this result:

    // Array ( [0] => T. Toth [1] => M. A. Carlo de Miller [2] => T. Stallone )

    // I want a result of the form:

    // Array ( [0] => T [1] => Toth [2] => M. A. [3] => Carlo de Miller [4] => T and so on....

Can someone suggest how to proceed? A solution in MySQL would also be fine.

    This is not exactly easily done. How are you going to differentiate the T. and M. from T. Toth and M. A. Carlo? Both are initials, but obviously you want them treated differently. Commented Mar 8, 2013 at 20:58

3 Answers

One more variant:

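$text2 = "T. Toth, M. A. Carlo de Miller, T. Stallone";
$result = array();
foreach (explode(",", $text2) as $row)
{
  // Split on dots: the last piece is the surname, the rest are the initials.
  $row = explode(".", $row);
  $last = array_pop($row);
  // trim() removes the space left behind after each comma and dot.
  $result[] = trim(join(".", $row)) . ".";
  $result[] = trim($last);
}
print_r($result);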

Result:

Array ( [0] => T. [1] => Toth [2] => M. A. [3] => Carlo de Miller [4] => T. [5] => Stallone )
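
The idea behind this variant is that everything up to the last dot is the initials and the rest is the surname. A minimal sketch of that idea using `strrpos` follows; `splitAuthor` is a hypothetical helper name, not something from the answer. It assumes the surname itself contains no dot (an entry such as "J. St. John" would be split in the wrong place), and it returns an empty string for the initials when an entry has no dot at all.

```php
<?php
// Hypothetical helper (not part of the original answer):
// split one "I. I. Lastname" entry at its last dot.
function splitAuthor(string $entry): array
{
    $entry = trim($entry);
    $pos = strrpos($entry, '.');
    if ($pos === false) {
        // No dot, so no initials: treat the whole entry as the surname.
        return ['', $entry];
    }
    // Everything up to and including the last dot is the initials.
    return [trim(substr($entry, 0, $pos + 1)), trim(substr($entry, $pos + 1))];
}

$text2 = "T. Toth, M. A. Carlo de Miller, T. Stallone";
$result = [];
foreach (explode(',', $text2) as $entry) {
    foreach (splitAuthor($entry) as $part) {
        $result[] = $part;
    }
}
print_r($result);
```

For the sample string this should give the same six elements as the result shown above.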

I think this regular expression should more or less do what you want:

/
  (?:^|,)           # Start of subject or comma
  \s*               # Optional white space
  ((?:[a-z]\.\s*)+) # At least one occurrence of alpha followed by dot
  \s*               # Consume trailing whitespace
/ix

When used in combination with the PREG_SPLIT_NO_EMPTY and PREG_SPLIT_DELIM_CAPTURE flags, this expression will obtain the result you want. The only caveat is that some elements will also keep leading/trailing whitespace. I can't see a way to avoid this, but it can easily be trimmed off when you use the result.

$str = 'T. Toth, M. A. Carlo de Miller, T. Stallone';
$expr = '/(?:^|,)\s*((?:[a-z]\.\s*)+)\s*/i';
$flags = PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE;

$keywords = preg_split($expr, $str, -1, $flags);

print_r($keywords);
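
The trimming mentioned above can be done in one pass over the result. This is a small sketch that reuses the expression and flags from this answer unchanged and only adds `array_map('trim', ...)`:

```php
<?php
$str   = 'T. Toth, M. A. Carlo de Miller, T. Stallone';
$expr  = '/(?:^|,)\s*((?:[a-z]\.\s*)+)\s*/i';
$flags = PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE;

// preg_split() returns the captured initials with a trailing space
// (e.g. "T. "), so trim every element before using the array.
$keywords = array_map('trim', preg_split($expr, $str, -1, $flags));

print_r($keywords);
```

For the sample string the trimmed array should contain `T.`, `Toth`, `M. A.`, `Carlo de Miller`, `T.` and `Stallone`, in that order.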


preg_split may not be the right function for this. Try this with preg_match_all:

$text2= "T. Toth, M. A. Carlo de Miller, T. Stallone";
preg_match_all("/\w{2,}(?:\s\w{2,})*|\w\.(?:\s\w\.)*/i", $text2, $matches);
print_r($matches[0]);

This picks out names and initials, while leaving out leading/trailing whitespace.

The first alternative matches a whole name: \w{2,}(?:\s\w{2,})*

The second alternative matches initials: \w\.(?:\s\w\.)*

Results in:

Array ( [0] => T. [1] => Toth [2] => M. A. [3] => Carlo de Miller [4] => T. [5] => Stallone )
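
To make the two alternatives easier to read, the same pattern can be written in extended mode (the `x` flag), which ignores whitespace inside the pattern and allows `#` comments. This is only a restatement of the expression from this answer, not a different technique; the variable name `$pattern` is introduced here for illustration.

```php
<?php
$pattern = '/
    \w{2,} (?: \s \w{2,} )*   # a name: words of two or more characters, separated by single whitespace
  | \w \. (?: \s \w \. )*     # initials: single characters, each followed by a dot
/ix';

$text2 = "T. Toth, M. A. Carlo de Miller, T. Stallone";
preg_match_all($pattern, $text2, $matches);

// $matches[0] holds the full matches as a flat array.
print_r($matches[0]);
```

Note that `\w` also matches digits and underscores, so this pattern assumes the column contains only plain names and initials.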
