
I want to download the images from a web page, for example www.yahoo.com, and store them in a folder using PHP.

I am getting the page source using file_get_contents() and extracting the img src attributes. I pass each src to the cURL code below. The code does not report any errors, but no images are downloaded. Please check the code; I cannot see where I am going wrong.

<?php
    $html = file_get_contents('www.yahoo.com');
    $ptn = '/< *img[^>]*src *= *["\']?([^"\']*)/i';
    preg_match_all($ptn, $html, $matches, PREG_PATTERN_ORDER);
    $seq = 1;
    foreach($matches as $img)
    {
        $fp = fopen("root/Images/image_$seq.jpg", 'wb');
        $ch = curl_init ($img);
        curl_setopt($ch,CURLOPT_FILE, $fp);
        curl_setopt($ch,CURLOPT_URL, $img);
        curl_setopt($ch, CURLOPT_HEADER, 0);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
        curl_setopt($ch, CURLOPT_BINARYTRANSFER, 1);
        $image = curl_exec($ch);
        curl_close($ch);
        fwrite($fp, $image);
        fclose($fp);
        $seq++;
    }
    echo "IMAGES DOWNLOADED";
?>

3 Answers

foreach($matches as $img)

should be changed to

foreach($matches[1] as $img)

BTW: you should replace file_get_contents() with cURL; it's about 3x as fast ;)
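To see why, note that preg_match_all() with PREG_PATTERN_ORDER puts the full matches in $matches[0] and the first capture group (the src values) in $matches[1]. A minimal demo with hypothetical HTML:

```php
<?php
// Hypothetical input; the pattern is the one from the question.
$html = '<p><img src="a.jpg"> <img src="b.png"></p>';
$ptn  = '/< *img[^>]*src *= *["\']?([^"\']*)/i';
preg_match_all($ptn, $html, $matches, PREG_PATTERN_ORDER);

// $matches[0] holds the full matched text of each <img ... src=" prefix;
// $matches[1] holds only the captured src values: ["a.jpg", "b.png"].
// Looping over $matches itself hands you these sub-arrays, not URLs.
```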



  • Is $img the full URL of the image?
  • Is the image hotlink-protected (set a referer)?

    $image = false;
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_REFERER, $url);
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_HEADER, 0);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_TIMEOUT, 7);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_ENCODING, 'gzip'); // must be a quoted string, not a bare constant
    curl_setopt($ch, CURLOPT_BINARYTRANSFER, 1);
    $image = curl_exec($ch);
    curl_close($ch);

Try debugging first.

First try it with a single, direct image URL, e.g. http://www.depers.nl/beeld/w100/2011/201105/20110510/anp/sport/img-100511-349.onlinebild.jpg.

Also, why use both file_get_contents() and cURL? Use cURL for everything.

  1. Make a cURL helper function: function simple_curl($url, $binary = false) { set your cURL options, return curl_exec() }.
  2. Get yahoo.com: $result = simple_curl($url);
  3. Extract the links with the pattern (check whether the matches contain the full URL: domain + directory + file).
  4. Loop over the pattern matches (don't forget: it is a multi-dimensional array, so loop over $matches[1]).
  5. cURL the binary file and save it: $image = simple_curl($match, true);


  • www.yahoo.com is not a URL; http://www.yahoo.com/ is.
  • $img is an array; you need to iterate over $matches[1].
  • You tell cURL both to write to a file (CURLOPT_FILE) and to return the result (CURLOPT_RETURNTRANSFER). Use one or the other.

I don't know how you are not seeing errors; I would look into that. Copying, pasting, and running your code gave me plenty of them.
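One way to apply the last point is to keep CURLOPT_FILE and drop CURLOPT_RETURNTRANSFER entirely. A minimal sketch; the helper name save_image is mine, not from the question:

```php
<?php
// Hypothetical helper: let cURL write the body straight to the file handle.
function save_image($url, $path) {
    $fp = fopen($path, 'wb');
    if ($fp === false) {
        return false;
    }
    $ch = curl_init($url);                // $url must be absolute, e.g. http://...
    curl_setopt($ch, CURLOPT_FILE, $fp);  // cURL writes the response body to $fp
    curl_setopt($ch, CURLOPT_HEADER, 0);
    $ok = curl_exec($ch);                 // returns true/false in this mode, not the body
    curl_close($ch);
    fclose($fp);
    return $ok;
}
```

With this shape there is no need for the fwrite() call from the question, since cURL has already written the data.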

