76

I need to parse an HTML document and find all occurrences of the string asdf in it.

I currently have the HTML loaded into a string variable. I just need the character position of each occurrence so I can loop through the list and return some data that follows each match.

The strpos function only returns the first occurrence. How can I get all of them?

0

10 Answers

109

Without using regex, something like this should work for returning the string positions:

$html = "dddasdfdddasdffff";
$needle = "asdf";
$lastPos = 0;
$positions = array();

while (($lastPos = strpos($html, $needle, $lastPos))!== false) {
    $positions[] = $lastPos;
    $lastPos = $lastPos + strlen($needle);
}

// Displays 3 and 10
foreach ($positions as $value) {
    echo $value ."<br />";
}
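
Since the question asks for the data that follows each match, here is a minimal sketch of that step, assuming PHP 7 or later. The helper name data_after_matches and its $length parameter are hypothetical, not part of the answer above, and the offsets are byte offsets, as with strpos.

```php
<?php
// Hypothetical helper: maps each match position to the (up to) $length
// bytes that follow the needle at that position.
function data_after_matches(string $haystack, string $needle, int $length): array
{
    $results = [];
    $needleLen = strlen($needle);
    if ($needleLen === 0) {
        return $results; // an empty needle would never advance the offset
    }
    $offset = 0;
    while (($pos = strpos($haystack, $needle, $offset)) !== false) {
        // Read the data that starts right after the matched needle.
        $results[$pos] = substr($haystack, $pos + $needleLen, $length);
        $offset = $pos + $needleLen; // continue after this match
    }
    return $results;
}

print_r(data_after_matches("dddasdfdddasdffff", "asdf", 3));
```

For the sample string this maps position 3 to "ddd" and position 10 to "fff".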

3 Comments

Please be careful using assignments in loop conditions. In this case, your while loop didn't work for position 0. I've updated your answer.
Excellent fix, but those who need to handle special characters (é, ë, ...) should replace strpos with mb_strpos; otherwise the returned positions are byte offsets rather than character positions.
Anyone reusing this code should be careful: if the needle can overlap itself (something like "dd") and overlapping matches are wanted, $lastPos should only increase by one inside the while loop.
24

You can call the strpos function repeatedly until a match is not found. You must specify the offset parameter.

Note: in the following example, the search continues from the next character instead of from the end of previous match. According to this function, aaaa contains three occurrences of the substring aa, not two.

function strpos_all($haystack, $needle) {
    $offset = 0;
    $allpos = array();
    while (($pos = strpos($haystack, $needle, $offset)) !== FALSE) {
        $offset   = $pos + 1;
        $allpos[] = $pos;
    }
    return $allpos;
}
print_r(strpos_all("aaa bbb aaa bbb aaa bbb", "aa"));

Output:

Array
(
    [0] => 0
    [1] => 1
    [2] => 8
    [3] => 9
    [4] => 16
    [5] => 17
)
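
The two strategies seen so far (skip past the whole match, or advance by one character) can be combined behind a flag. The following is only a sketch, assuming PHP 7 or later; the function name find_all_positions and the $allowOverlap parameter are hypothetical.

```php
<?php
// Hypothetical helper: collects the byte offsets of every occurrence of
// $needle, optionally including overlapping occurrences.
function find_all_positions(string $haystack, string $needle, bool $allowOverlap = false): array
{
    $positions = [];
    $needleLen = strlen($needle);
    if ($needleLen === 0) {
        return $positions; // an empty needle would never advance the offset
    }
    // Advance by one character to catch overlaps, otherwise by the needle length.
    $step = $allowOverlap ? 1 : $needleLen;
    $offset = 0;
    while (($pos = strpos($haystack, $needle, $offset)) !== false) {
        $positions[] = $pos;
        $offset = $pos + $step;
    }
    return $positions;
}

print_r(find_all_positions("aaaa", "aa"));       // non-overlapping: 0 and 2
print_r(find_all_positions("aaaa", "aa", true)); // overlapping: 0, 1 and 2
```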


17

It's better to use substr_count. See its documentation on php.net.
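
To make the behaviour of substr_count concrete, here is a small sketch comparing it with a strpos loop that advances by one character. The variable names are illustrative only. Note that substr_count returns a count, not positions.

```php
<?php
$haystack = 'abababa';
$needle   = 'aba';

// substr_count() only counts non-overlapping occurrences.
$nonOverlapping = substr_count($haystack, $needle);

// Counting overlapping occurrences needs a manual loop that moves
// forward one character after each match.
$overlapping = 0;
$offset = 0;
while (($pos = strpos($haystack, $needle, $offset)) !== false) {
    $overlapping++;
    $offset = $pos + 1;
}

echo $nonOverlapping . "\n"; // 2
echo $overlapping . "\n";    // 3
```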

2 Comments

This only gives you the count, not the positions the question asked for.
"This function doesn't count overlapped substrings." For the string 'abababa', searching for 'aba' counts only 2 occurrences, not 3.
4
function getocurence($chaine, $rechercher)
{
    $lastPos = 0;
    $positions = array();
    while (($lastPos = strpos($chaine, $rechercher, $lastPos)) !== false) {
        $positions[] = $lastPos;
        $lastPos = $lastPos + strlen($rechercher);
    }
    return $positions;
}

1 Comment

Code-only answers are low value on StackOverflow because they do very little to educate the OP and future readers. Please edit your answer with the intent to educate thousands of future SO readers and the OP.
3

This can be done with the strpos() function. The following code uses a for loop that checks every index of the string and records the index whenever a match starts there.

<?php

$str_test = "Hello World! welcome to php";

$count = 0;
$find = "o";
$positions = array();
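$length = strlen($str_test);
for ($i = 0; $i < $length; $i++)
{
     // strpos() returns false when nothing is found; the strict comparison
     // keeps false from being treated as position 0
     $pos = strpos($str_test, $find, $count);
     if ($pos === $count) {
           $positions[] = $pos;
     }
     $count++;
}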
foreach ($positions as $value) {
    echo '<br/>' .  $value . "<br />";
}

?>


2

Use preg_match_all to find all occurrences.

preg_match_all('/(\$[a-z]+)/i', $str, $matches);

For further reference check this link.

1 Comment

He's looking for string positions, not just matches. Also he's looking to match "asdf", not [a-z]...
2

Salman A has a good answer, but remember to make your code multibyte-safe. To get correct positions with UTF-8, use mb_strpos instead of strpos:

function strpos_all($haystack, $needle) {
    $offset = 0;
    $allpos = array();
    while (($pos = mb_strpos($haystack, $needle, $offset)) !== FALSE) {
        $offset   = $pos + 1;
        $allpos[] = $pos;
    }
    return $allpos;
}
print_r(strpos_all("aaa bbb aaa bbb aaa bbb", "aa"));
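
The difference between the two functions shows up as soon as the haystack contains a multibyte character. This is a short sketch, assuming PHP 7 or later with the mbstring extension enabled and UTF-8 text; the variable names are illustrative.

```php
<?php
// "\u{00E9}" is the letter é, which takes two bytes in UTF-8.
$text = "\u{00E9}t\u{00E9} asdf";

// strpos() counts bytes, mb_strpos() counts characters.
$bytePos = strpos($text, 'asdf');
$charPos = mb_strpos($text, 'asdf', 0, 'UTF-8');

echo $bytePos . "\n"; // 6
echo $charPos . "\n"; // 4

// Each offset must be paired with the matching substring function.
echo substr($text, $bytePos, 4) . "\n";             // asdf
echo mb_substr($text, $charPos, 4, 'UTF-8') . "\n"; // asdf
```

Either offset is usable, as long as byte offsets go to the byte-oriented functions and character offsets go to the mb_* functions.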


1

Another solution is to use explode():

public static function allSubStrPos($str, $del)
{
    $searchArray = explode($del, $str);
    unset($searchArray[count($searchArray) - 1]);
    $positionsArray = [];
    $index = 0;
    foreach ($searchArray as $i => $s) {
        array_push($positionsArray, strlen($s) + $index);
        $index += strlen($s) + strlen($del);
    }
    return $positionsArray;
}

1 Comment

The other solutions are easily made case insensitive if needed, this solution is not. It is also about 50% slower on my machine.
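
As an illustration of the case-insensitivity point, a loop-based version only needs stripos in place of strpos. This is a sketch, assuming PHP 7 or later; the function name stripos_all is hypothetical.

```php
<?php
// Hypothetical helper: like the strpos-based loops above, but
// case-insensitive thanks to stripos().
function stripos_all(string $haystack, string $needle): array
{
    $positions = [];
    $needleLen = strlen($needle);
    if ($needleLen === 0) {
        return $positions; // an empty needle would never advance the offset
    }
    $offset = 0;
    while (($pos = stripos($haystack, $needle, $offset)) !== false) {
        $positions[] = $pos;
        $offset = $pos + $needleLen; // non-overlapping matches
    }
    return $positions;
}

print_r(stripos_all("AsDf asdf ASDF", "asdf")); // 0, 5 and 10
```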
0

Simple strpos_all() function.

function strpos_all($haystack, $needle_regex)
{
    preg_match_all('/' . $needle_regex . '/', $haystack, $matches, PREG_OFFSET_CAPTURE);
    return array_map(function ($v) {
        return $v[1];
    }, $matches[0]);
}

Usage: Simple string as needle.

$html = "dddasdfdddasdffff";
$needle = "asdf";

$all_positions = strpos_all($html, $needle);
var_dump($all_positions);

Output:

array(2) {
  [0]=>
  int(3)
  [1]=>
  int(10)
}

Or with regex as needle.

$html = "dddasdfdddasdffff";
$needle = "[d]{3}";

$all_positions = strpos_all($html, $needle);
var_dump($all_positions);

Output:

array(2) {
  [0]=>
  int(0)
  [1]=>
  int(7)
}

2 Comments

Using regular expressions to look for a substring is not a good approach. Of course you can do it but regex is for more complex scenarios. Using strpos is much simpler in this case and does the job.
A warning about getting the offset in a string that may have multibyte characters: preg_match and UTF-8 in PHP
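
If the regex approach is used with a plain string, the needle should be escaped so characters such as "." are not treated as metacharacters. Below is a minimal sketch using preg_quote, assuming PHP 7 or later; the function name regex_literal_positions is hypothetical, and the offsets reported by PREG_OFFSET_CAPTURE are byte offsets.

```php
<?php
// Hypothetical helper: finds a literal needle via preg_match_all by
// escaping it first, and returns the byte offsets of all matches.
function regex_literal_positions(string $haystack, string $needle): array
{
    if ($needle === '') {
        return [];
    }
    // Escape regex metacharacters and the "/" delimiter.
    $pattern = '/' . preg_quote($needle, '/') . '/';
    preg_match_all($pattern, $haystack, $matches, PREG_OFFSET_CAPTURE);
    // Each entry in $matches[0] is [matched text, offset]; keep the offsets.
    return array_column($matches[0], 1);
}

print_r(regex_literal_positions("a.b axb a.b", "a.b")); // 0 and 8
```

Without the escaping step, the pattern a.b would also match "axb" at offset 4.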
0
<?php
$mainString = "dddjmnpfdddjmnpffff";
$needle = "jmnp";
$lastPos = 0;
$positions = array();

while (($lastPos = strpos($mainString, $needle, $lastPos)) !== false) {
    $positions[] = $lastPos;
    $lastPos = $lastPos + strlen($needle);
}

// Displays 3 and 11
foreach ($positions as $value) {
    echo $value ."<br />";
}
?>

