1

I am trying to extract all strings that look like 12-15 from a parent string. This means all strings that have a dash in between two digits.

Using this answer as a basis, I tried the following:

<?php

$str = "34,56,67-90,45";
preg_match('/^(\d-\d)|(,\d-\d)|(\d-\d,)|(,\d-\d,)$/', $str, $output, PREG_OFFSET_CAPTURE);
echo print_r($output);

?>

This looks for any substring that looks like a dash enclosed between digits, whether it has a comma before it, after it, both, or neither. When I run the PHP code, I get an empty array. On Regex101, when I test the regular expression, strings like 4-5,,,,, seem to match, and I don't understand why it lets me add extra commas.

What's wrong with my regex that I get an empty array?

2
  • Why are you using echo and print_r on the same line? print_r already prints the array, you don't need to call echo. Commented Sep 4, 2015 at 2:28
  • @Barmar it was a mistake, I usually add true after print_r() Commented Sep 4, 2015 at 2:31

3 Answers

4

I think you could use a simple regex like this

\d+[-]\d+

That is (match at least 1 digit) (match a literal dash) (match at least 1 digit)
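
To collect every such range rather than just the first, the pattern can be passed to preg_match_all. This is only a minimal sketch using the sample string from the question; the variable name $matches is my own choice.

```php
<?php
// Sample input from the question.
$str = "34,56,67-90,45";

// preg_match_all() collects every match of the pattern;
// $matches[0] holds the full-pattern matches.
preg_match_all('/\d+-\d+/', $str, $matches);

print_r($matches[0]); // a one-element array containing "67-90"
```

Note that this pattern has no anchors or comma checks, so on an input such as 1-2-3 it would still report 1-2. Whether that matters depends on how strict the input format is.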


2

\d matches a single digit. All the numbers in your sample string have two digits. You should use \d+ to match one or more digits.

preg_match('/^(\d+-\d+)|(,\d+-\d+)|(\d+-\d+,)|(,\d+-\d+,)$/', $str, $output, PREG_OFFSET_CAPTURE);

Output:

Array
(
    [0] => Array
        (
            [0] => ,67-90
            [1] => 5
        )

    [1] => Array
        (
            [0] => 
            [1] => -1
        )

    [2] => Array
        (
            [0] => ,67-90
            [1] => 5
        )

)

You can also simplify the regexp:

preg_match('/(?:^|,)\d+-\d+(?:,|$)/', $str, $output, PREG_OFFSET_CAPTURE);

Output:

Array
(
    [0] => Array
        (
            [0] => ,67-90,
            [1] => 5
        )

)

5 Comments

How would I extract ALL instances of the pattern into the array?
Use preg_match_all instead of preg_match.
The flags for preg_match_all are a bit confusing. I just wanted an array of the matches found in the parent string that I can iterate through. It seems that if I use the flag PREG_PATTERN_ORDER and do preg_match_all($pattern, $str, $output, PREG_PATTERN_ORDER);, $output[0] is the array I'm looking for. I've done a few tests, and the results seem consistent. To make sure, am I correct about this?
Yes, that's correct. And I agree, the options are a bit confusing, I can never remember which way is the default.
Your regex fails when there are consecutive pairs of numbers 34,56,67-90,34-53,24-23 regex101.com/r/kO2iD7/1
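
The two points raised in these comments can be checked with a short script. This is a sketch, not part of the original answer: the input "12-15,34,67-90" and the variable names are made up for illustration. PREG_PATTERN_ORDER is the default when no flag is passed.

```php
<?php
$pattern = '/(?:^|,)\d+-\d+(?:,|$)/';

// No flag argument: preg_match_all() defaults to PREG_PATTERN_ORDER,
// so $byPattern[0] is a flat list of all full matches.
preg_match_all($pattern, "12-15,34,67-90", $byPattern);

// PREG_SET_ORDER groups the results per match instead:
// $bySet[0] is the first match, $bySet[1] the second.
preg_match_all($pattern, "12-15,34,67-90", $bySet, PREG_SET_ORDER);

// Consecutive ranges: the comma after 67-90 is consumed by the first match,
// so 34-53 has no leading comma left to match against and is skipped.
preg_match_all($pattern, "34,56,67-90,34-53,24-23", $consecutive);

print_r($byPattern);
print_r($bySet);
print_r($consecutive[0]);
```

The last call finds ,67-90, and ,24-23 but not 34-53, which is the failure described in the last comment.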
1

The | operator has the lowest precedence, so ^ applies only to the first alternative and $ only to the last. Your expression is therefore interpreted as "match any one of the following":

  1. START of text -> 1 digit -> dash -> 1 digit (not matching end of text)
  2. Comma (may be in the middle of the text, anywhere) -> 1 digit -> dash -> 1 digit
  3. 1 digit (anywhere) -> dash -> 1 digit -> comma
  4. comma (anywhere) -> 1 digit -> dash -> 1 digit -> comma -> END of text
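
The list above can be checked against the inputs from the question. This is a small sketch; the grouped pattern at the end is only my illustration of how a (?:...) group makes both anchors apply to every alternative, not a recommended final regex.

```php
<?php
// The pattern from the question, unchanged.
$pattern = '/^(\d-\d)|(,\d-\d)|(\d-\d,)|(,\d-\d,)$/';

// The sample string has no single-digit pair around a dash, so nothing matches.
$noMatch = preg_match($pattern, "34,56,67-90,45");

// "4-5,,,,," matches through alternative 1: it is anchored at the start only,
// so whatever follows "4-5" is ignored.
$trailing = preg_match($pattern, "4-5,,,,,", $m);

// Wrapping the alternatives in a group makes ^ and $ apply to all of them,
// and the extra commas are then rejected.
$grouped = preg_match('/^(?:\d-\d|,\d-\d|\d-\d,|,\d-\d,)$/', "4-5,,,,,");

var_dump($noMatch, $trailing, $m[0], $grouped);
```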

Also, you are using \d, which matches exactly one digit (a single character). You can use \d{2} to match two digits (00 to 99), or \d+ to match any non-negative integer (1, 55, 123456, etc.).


In your case, I think you're trying to use this expression:

/(?:^|,)(\d+-\d+)(?=,|$)/

which means: START of text OR comma -> any integer -> dash -> any integer -> followed by (but not consumed in the match) a comma OR END of text
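
As a rough usage sketch, here is that expression with preg_match_all on the consecutive-pairs input mentioned in the comments. The variable names are my own, and capture group 1 is read so that the leading comma is left out of the results.

```php
<?php
// Input with consecutive ranges, taken from the comment thread.
$str = "34,56,67-90,34-53,24-23";

// The lookahead (?=,|$) checks for a comma or the end of the text without
// consuming it, so the same comma can still start the next match.
preg_match_all('/(?:^|,)(\d+-\d+)(?=,|$)/', $str, $matches);

// $matches[1] holds capture group 1: the ranges without surrounding commas.
print_r($matches[1]); // 67-90, 34-53, 24-23
```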

2 Comments

Your regex fails when there are consecutive pairs of numbers 34,56,67-90,34-53,24-23
True. I was misled by the use of preg_match instead of preg_match_all. Edited the answer to use a lookahead. Thanks
