0

I have an HTML file containing:

 <img width="10" height="12" scr="https://www.site.com/yughggcfgh">
<img width="11" height="15" scr="https://www.site.com/yughggcfghcvbcvb">
<img width="10" height="12" scr="https://www.site.com/a.jpg">
<img width="10" height="12" scr="https://www.site.com/b.gif">

I want to extract the images whose paths don't have an extension into an array.
The output must be as follows:

ari[1]= <img width="10" height="12" scr="https://www.site.com/yughggcfgh">
ari[2]= <img width="11" height="15" scr="https://www.site.com/yughggcfghcvbcvb"> 
  • possible duplicate of How to parse and process HTML with PHP? Commented Apr 4, 2012 at 11:58
  • I think you have a typo: scr= should be src=. Commented Apr 4, 2012 at 12:01

2 Answers

2

You really should use DOMDocument or another HTML parser rather than a regex. Here's an example:

<?php
$somesource = '<img width="10" height="12" src="https://www.site.com/yughggcfgh">
<img width="11" height="15" src="https://www.site.com/yughggcfghcvbcvb">
<img width="10" height="12" src="https://www.site.com/a.jpg">
<img width="10" height="12" src="https://www.site.com/b.gif">';

$xml = new DOMDocument();
// Suppress the warnings loadHTML() can raise for HTML fragments.
@$xml->loadHTML($somesource);

$image = array();
foreach ($xml->getElementsByTagName('img') as $img) {
    $src = $img->getAttribute('src');
    // Crude check: keep the src if the 4th character from the end is not a dot.
    // This only detects three-letter extensions such as .jpg or .gif.
    if (substr($src, -4, 1) !== '.') {
        $image[] = $src;
    }
}

print_r($image);

/* Output:
Array
(
    [0] => https://www.site.com/yughggcfgh
    [1] => https://www.site.com/yughggcfghcvbcvb
)
*/
?>
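
The question asks for the whole <img> tag rather than just the URL, and the check above only recognises three-letter extensions. A minimal sketch of one way to cover both, assuming PHP 5.3.6 or later (for saveHTML() with a node argument). The function name imagesWithoutExtension and the default extension list are illustrative choices, not part of the original answer:

```php
<?php
// Hypothetical helper: returns the HTML of every <img> whose src path
// does not end in one of the listed image extensions.
function imagesWithoutExtension($html, array $extensions = array('jpg', 'jpeg', 'gif', 'png'))
{
    $doc = new DOMDocument();
    // Suppress the warnings loadHTML() can raise for HTML fragments.
    @$doc->loadHTML($html);

    $result = array();
    foreach ($doc->getElementsByTagName('img') as $img) {
        $src = $img->getAttribute('src');
        if ($src === '') {
            continue; // no src attribute at all, nothing to classify
        }
        // Inspect only the URL path so a query string cannot hide the extension.
        $path = (string) parse_url($src, PHP_URL_PATH);
        $ext  = strtolower(pathinfo($path, PATHINFO_EXTENSION));
        if (!in_array($ext, $extensions, true)) {
            // Serialise just this element back to HTML.
            $result[] = $doc->saveHTML($img);
        }
    }
    return $result;
}

$html = '<img width="10" height="12" src="https://www.site.com/yughggcfgh">
<img width="11" height="15" src="https://www.site.com/yughggcfghcvbcvb">
<img width="10" height="12" src="https://www.site.com/a.jpg">
<img width="10" height="12" src="https://www.site.com/b.gif">';

$ari = imagesWithoutExtension($html);
print_r($ari);
```

With the sample markup this should leave two entries in $ari, one per extensionless image. Passing a different list as the second argument changes what counts as an image extension.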

5 Comments

Maybe be more restrictive as to what constitutes an image extension
I noted it's an example. I'm not here to complete scripts for peeps, only to give examples to build upon ;p
Thanks, it works! Will DOMDocument work on every PHP server?
@Alfredfrancis Yes, any server with PHP 5. Don't forget to scan through the documentation.
@LawrenceCherone: well… fair enough ;)
1

Regular expressions are probably not the right tool for the job, but here you go …

You should be able to achieve your goal with negative lookbehind assertions:

preg_match_all('/src=".+?(?<!\.jpg|\.jpeg|\.gif|\.png)"/', $html, $matches);
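
A slightly stricter variant of the same idea, as a sketch rather than a definitive pattern. It swaps .+? for [^"]+ so a match cannot run past the closing quote when several tags share a line, adds the i flag so .JPG is also excluded, and captures the URL in group 1. The extension list is an assumption, and the pattern must be wrapped in delimiters (here /), since the preg_* functions reject a pattern without them:

```php
<?php
$html = '<img width="10" height="12" src="https://www.site.com/yughggcfgh">
<img width="11" height="15" src="https://www.site.com/yughggcfghcvbcvb">
<img width="10" height="12" src="https://www.site.com/a.jpg">
<img width="10" height="12" src="https://www.site.com/b.gif">';

// [^"]+ stays inside one attribute value; the negative lookbehind rejects
// values that end in a listed extension just before the closing quote.
$pattern = '/src="([^"]+)(?<!\.jpg|\.jpeg|\.gif|\.png)"/i';

preg_match_all($pattern, $html, $matches);

// $matches[1] holds the captured URLs without the surrounding src="...".
print_r($matches[1]);
```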

1 Comment

Error: Warning: preg_match() [function.preg-match]: Delimiter must not be alphanumeric or backslash in C:\xampp\htdocs\curl\index.php on line 19
