If you enter the URLs into a browser you can see that both work; cdon even works without JavaScript. Have they blocked cURL somehow?

I'm trying to build a scraper to promote legal movies online, which would benefit them a great deal, so blocking scrapers in general seems stupid, IMHO. That said, I'm far from sure that's what's going on here! It might just be an error somewhere.

// Works
get_file1('http://sfanytime.com/sv-SE/Sokresultat/?field=all&q=The+Matrix', '/', 'sfanytime.html');

// Saves a blank 0 KB file
get_file1('http://downloads.cdon.com/index.phtml?action=search&search_terms=The+Matrix', '/', 'cdon.html');

function get_file1($file, $local_path, $newfilename) {
    $out = fopen($newfilename, 'wb');
    if ($out === FALSE) {
        return false;
    }
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_FILE, $out);
    curl_setopt($ch, CURLOPT_HEADER, 0);
    curl_setopt($ch, CURLOPT_URL, $file);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);

    curl_exec($ch);
    $error = curl_error($ch);
    if (strlen($error) > 0) {
        echo "<br>Error is : ". $error;
        return false;
    }
    curl_close($ch);
    return true;
}
  • I tested your code; it returned: bool(true) bool(true) Commented Dec 1, 2011 at 12:10
  • Remove curl_setopt($ch, CURLOPT_FAILONERROR, true); Commented Dec 1, 2011 at 12:11
  • @Oyeme Okay, good to know, but that doesn't really explain why it doesn't actually save the web page. @DaveRandom Done, but it didn't change anything for me. Commented Dec 1, 2011 at 13:40

1 Answer

You should change the line

curl_setopt($ch, CURLOPT_FAILONERROR, true);

...to...

curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);

CURLOPT_FAILONERROR makes cURL fail silently on HTTP error responses, which from what you say is not what you want. I have replaced it with CURLOPT_FOLLOWLOCATION because when I visit the second URL I get redirected to a "choose your country" type page. The redirect response itself has an empty body, so unless cURL follows the redirect, that empty body is all that gets written - which is why you get an empty file.

There is no problem with your code as such, simply a problem with the way you handle the response from the second URL. You don't see an error because, technically, there wasn't one.
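To see what cURL actually did with the second URL, it can help to report the final status code, the final URL and the number of bytes written instead of just true/false. The following is only a rough sketch, not a confirmed fix: the function name fetch_to_file, the $cookieJar parameter and the redirect limit of 5 are made up for illustration, and the idea that the site needs cookies to get past the country page is an assumption.

```php
<?php
// Sketch: download $url into $target, following redirects and keeping cookies.
// fetch_to_file and $cookieJar are hypothetical names; whether the site
// really requires cookies is an assumption.
function fetch_to_file($url, $target, $cookieJar = null) {
    $result = array(
        'ok' => false, 'http_code' => 0,
        'effective_url' => '', 'bytes' => 0, 'error' => '',
    );

    // Open the target first; bail out early if it is not writable.
    $out = @fopen($target, 'wb');
    if ($out === false) {
        $result['error'] = "cannot open $target for writing";
        return $result;
    }
    if ($cookieJar === null) {
        $cookieJar = tempnam(sys_get_temp_dir(), 'cookies');
    }

    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_FILE, $out);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);   // follow the 302
    curl_setopt($ch, CURLOPT_MAXREDIRS, 5);           // avoid redirect loops
    curl_setopt($ch, CURLOPT_COOKIEJAR, $cookieJar);  // save received cookies
    curl_setopt($ch, CURLOPT_COOKIEFILE, $cookieJar); // send them back

    $success = curl_exec($ch);
    $result['error'] = curl_error($ch);
    $result['http_code'] = (int) curl_getinfo($ch, CURLINFO_HTTP_CODE);
    $result['effective_url'] = (string) curl_getinfo($ch, CURLINFO_EFFECTIVE_URL);
    curl_close($ch);
    fclose($out);

    // A transfer can "succeed" and still write nothing, so check the size too.
    clearstatcache();
    $result['bytes'] = (int) filesize($target);
    $result['ok'] = ($success !== false) && $result['bytes'] > 0;
    return $result;
}
```

Calling var_dump(fetch_to_file($url, 'cdon.html')) then shows whether cURL ended up on the country page, and whether anything was written at all.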

7 Comments

Thanks! It's still an empty file though, even with that changed!
@OZZIE hmmm... works for me, can you update the question with your current code?
Can you send a preferred language via cURL?
It depends how the site is set up; I imagine it uses cookies. Hang on, let me have a play with it.
Hmm, okay, well, I didn't solve it, but at least now I get a response:
HTTP/1.1 302 Found
Date: Thu, 01 Dec 2011 13:49:51 GMT
Server: Apache
Set-Cookie: cdon=s4XKhFniWFPlY; path=/
Expires: Thu, 19 Nov 1981 08:52:00 GMT
Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
Pragma: no-cache
Location: flagpage.php
X-Server: 1
Content-Length: 0
Connection: close
Content-Type: text/html; charset=iso-8859-1
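The 302 response quoted above can be picked apart to see what the server is asking for: a redirect to a relative Location, a session cookie, and a zero-length body. Below is a small sketch that does this for a raw header block; parse_response_headers is a hypothetical helper written for this example (not part of cURL), and $raw is a shortened copy of the headers from the comment.

```php
<?php
// Sketch: split a raw HTTP response header block into a status code and
// a map of lower-cased header names to lists of values.
// parse_response_headers is a hypothetical helper, not a cURL function.
function parse_response_headers($raw) {
    $lines = preg_split('/\r\n|\n|\r/', trim($raw));
    $status = 0;
    $headers = array();
    foreach ($lines as $i => $line) {
        // The first line is the status line, e.g. "HTTP/1.1 302 Found".
        if ($i === 0 && preg_match('#^HTTP/\S+\s+(\d{3})#', $line, $m)) {
            $status = (int) $m[1];
            continue;
        }
        $pos = strpos($line, ':');
        if ($pos === false) {
            continue; // skip anything that is not "Name: value"
        }
        $name = strtolower(trim(substr($line, 0, $pos)));
        $headers[$name][] = trim(substr($line, $pos + 1));
    }
    return array('status' => $status, 'headers' => $headers);
}

// Shortened copy of the response from the comment above.
$raw = "HTTP/1.1 302 Found\r\n"
     . "Server: Apache\r\n"
     . "Set-Cookie: cdon=s4XKhFniWFPlY; path=/\r\n"
     . "Location: flagpage.php\r\n"
     . "Content-Length: 0\r\n";

$parsed = parse_response_headers($raw);
echo $parsed['status'], ' ', $parsed['headers']['location'][0], "\n";
// prints "302 flagpage.php"
```

The Content-Length of 0 is consistent with the 0 KB file, and the Set-Cookie header suggests (but does not prove) that the site expects the cookie to be sent back on the next request.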