
I have the following string:

H: 290​‐​314 P: 280​‐​301+330​​​​U+200B+331​string‐​305+351+338​‐​308+310 [2]

I need all the numbers after "P:", that is: [280,301,330,331,305,351,338,308,310].

Note that the string contains the text U+200B, which is a character code and must be ignored.

I tried #P:\s((\d+)[​\‐]+)+# but that doesn't work.


2 Answers

1

I'd use the \G "continue" anchor this way: (Demo)

$str = 'H: 290‐314 P: 280‐301+330U+200B+331string‐305+351+338‐308+310 [2]';
preg_match_all('~(?:P: |\G(?!^)(?:U\+200B)?[^\d ]+)\K\d+~', $str, $m);
var_export($m[0]);

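Start at "P: ", then match consecutive digits. From the second match on, \G anchors each attempt at the end of the previous match, so the delimiter before the next number must be an optional U+200B (your blacklisted string) followed by one or more non-digit, non-space characters. \K then discards everything matched so far, so only the digits end up in the result.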
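To see what the \G chaining buys you, here is a minimal sketch that compares the pattern above with a plain \d+ scan. The helper name numbersAfterP is hypothetical, and the sample string uses ASCII hyphens instead of the original hyphen characters, which is an assumption made only to keep the example easy to type.

```php
<?php
// Sample input modelled on the question; ASCII hyphens are an assumption.
$str = 'H: 290-314 P: 280-301+330U+200B+331string-305+351+338-308+310 [2]';

// Hypothetical helper: returns the numbers that follow "P: " as integers.
function numbersAfterP(string $input): array
{
    // "P: " starts the chain; \G(?!^) continues it from the previous match.
    $pattern = '~(?:P: |\G(?!^)(?:U\+200B)?[^\d ]+)\K\d+~';
    preg_match_all($pattern, $input, $m);
    return array_map('intval', $m[0]);
}

// Unanchored baseline: every digit run in the string, wanted or not.
preg_match_all('~\d+~', $str, $all);

$wanted = numbersAfterP($str);

echo implode(',', $all[0]), PHP_EOL; // also contains 290, 314, 200 and 2
echo implode(',', $wanted), PHP_EOL; // only the numbers chained after "P: "
```

The baseline scan picks up the numbers before "P:", the 200 inside U+200B and the trailing 2, whereas the chained pattern stops at the first space after 310 because a space cannot be part of a delimiter.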


0

You can use

(?:\G(?!\A)(?:[^\d\s]*200B)?|P:\h*)[^\d\s]*\K(?!200B)\d+

See the regex demo.

Details:

  • (?:\G(?!\A)(?:[^\d\s]*200B)?|P:\h*) - either the end of the previous successful match (\G, with (?!\A) ruling out the start of the string), optionally followed by zero or more chars other than digits/whitespace and then 200B; or P: and zero or more horizontal whitespaces
  • [^\d\s]* - zero or more chars other than digits and whitespace
  • \K - match reset operator that discards the text matched so far from the overall match memory buffer
  • (?!200B)\d+ - one or more digits, provided they do not begin the 200B char sequence.

See the PHP demo:

$text = 'H: 290‐314 P: 280‐301+330U+200B+331string‐305+351+338‐308+310 [2]';
if (preg_match_all('~(?:\G(?!\A)(?:[^\d\s]*200B)?|P:\h*)[^\d\s]*\K(?!200B)\d+~', $text, $matches)) {
    print_r($matches[0]);
}

Output:

Array
(
    [0] => 280
    [1] => 301
    [2] => 330
    [3] => 331
    [4] => 305
    [5] => 351
    [6] => 338
    [7] => 308
    [8] => 310
)
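If the raw input also contains real zero-width space characters around the delimiters (an assumption about the original data, since they are invisible when printed), the same pattern should still cope, because [^\d\s]* consumes their bytes like any other non-digit, non-whitespace character. A small sketch, with a shortened test string made up for illustration and the "\u{200B}" escape, which requires PHP 7.0 or later:

```php
<?php
// Assumption: real zero-width spaces (U+200B) sit next to the delimiters,
// in addition to the literal text "U+200B".
$z = "\u{200B}";
$text = "H: 290{$z}-{$z}314 P: 280{$z}-{$z}301+330{$z}{$z}U+200B+331{$z}string-{$z}305+351 [2]";

$pattern = '~(?:\G(?!\A)(?:[^\d\s]*200B)?|P:\h*)[^\d\s]*\K(?!200B)\d+~';
preg_match_all($pattern, $text, $matches);

// Cast to integers for easier comparison.
$numbers = array_map('intval', $matches[0]);
echo implode(',', $numbers), PHP_EOL;
```

The 200 from U+200B is skipped because the optional group swallows everything up to and including 200B before \K resets the match.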
