
I have read many questions on this topic. Basically, I'm using a combination of get_headers() and cURL to check whether a URL exists.

$url = "http://www.asdkkk.com";
$headers = get_headers($url); // returns false when the request fails outright

if ($headers !== false && strpos($headers[0], '404') === false) {

    $ch = curl_init($url);
    curl_setopt_array($ch, array(
        CURLOPT_HEADER         => true,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_SSL_VERIFYPEER => false,
        CURLOPT_HTTPHEADER     => array("Accept-Language: en-US;q=0.6,en;q=0.4"),
        CURLOPT_USERAGENT      => 'Mozilla/5.0 (Windows NT 5.1) AppleWebKit/535.6 (KHTML, like Gecko) Chrome/16.0.897.0 Safari/535.6'
    ));
    $data = curl_exec($ch);
    $httpCode = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    curl_close($ch); // close the handle on every path, not only on success
    if ($httpCode != 404) {
        return $data;
    }
} else {
    echo "URL does not exist";
}

Both functions return status code 200 for this URL ("http://www.asdkkk.com"). The URL shows a "page not found" website, but it seems to be hosted, and the page's header is not set to 404. I have tried not only this URL but others too. So how can I determine accurately whether a URL actually exists?

  • possible duplicate of How can I check if a URL exists via PHP? Commented Dec 8, 2014 at 19:45
  • It isn't; I have read that question before. @castis Commented Dec 8, 2014 at 19:47
  • If the website displays a "404" message even when it serves up a response code of 200, then it is the website that is not behaving properly. You might need to actually parse the response content itself to determine if it is a "404". Commented Dec 8, 2014 at 19:48
  • What I mean is that the URL returns a 200 status code no matter what. Commented Dec 8, 2014 at 19:48
  • @MikeBrant can you give me an example? Or some sort of article or question? And thanks for the help =D Commented Dec 8, 2014 at 19:50
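
The suggestion in the comments to parse the response content itself could look roughly like the sketch below. It is a heuristic under stated assumptions, not a definitive check: the function name `looks_like_soft_404` and the phrase list are hypothetical, and the idea of looking only at the `<title>` and `<h1>` text is one design choice among many.

```php
<?php
// Hypothetical heuristic for "soft 404" pages (status 200, but an error page).
// The phrase list is an assumption; tune it for the sites you check.
function looks_like_soft_404(string $html): bool
{
    $phrases = array('404', 'page not found', 'not be found', 'does not exist');

    // Only inspect <title> and <h1>: error pages usually announce themselves
    // there, and a stray "404" deep in the body is less likely to matter.
    $parts = array();
    if (preg_match_all('/<(title|h1)\b[^>]*>(.*?)<\/\1>/is', $html, $m)) {
        $parts = $m[2];
    }
    $haystack = strip_tags(implode(' ', $parts));

    foreach ($phrases as $phrase) {
        if (stripos($haystack, $phrase) !== false) {
            return true;
        }
    }
    return false;
}
```

You would pass it the body returned by `curl_exec()` (with `CURLOPT_RETURNTRANSFER` enabled and `CURLOPT_HEADER` disabled) after confirming the status code is 200. Expect false positives and false negatives; a page titled "Route 404 guide" would be flagged, for example.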

1 Answer


I think the issue with your example code is that you are confusing a 404 'Not Found' HTTP response code from a server with the case of a URL that doesn't point to any server at all. If there is no server response at all, cURL reports 0 as the HTTP code (via CURLINFO_HTTP_CODE) rather than 404. Try running the code below and see if it works for your purposes:

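// The URL from the question, a missing page on a live server, and a live page.
$urls = array(
    "http://www.asdkkk.com",
    "http://www.google.com/cantfindthisurl",
    "http://www.google.com",
);
$ch = curl_init();
foreach ($urls as $url) {
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // do not print the response body
    curl_exec($ch);
    // 0 means no server responded; otherwise this is the HTTP status code
    $http_status = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    echo "$http_status for $url <br>";
}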
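
The distinction drawn in this answer can be wrapped in a small helper. This is a minimal sketch rather than a definitive implementation: the names `classify_http_code` and `probe_url`, the three labels, and the 10-second timeouts are assumptions added for illustration, while `CURLOPT_URL`, `CURLOPT_RETURNTRANSFER`, and `CURLINFO_HTTP_CODE` are the same calls used in the loop above.

```php
<?php
// Hypothetical helper: maps the value of CURLINFO_HTTP_CODE to a label.
function classify_http_code(int $httpCode): string
{
    if ($httpCode === 0) {
        return 'no_response'; // no HTTP response was received at all
    }
    if ($httpCode === 404) {
        return 'not_found';   // a server answered and reported a missing page
    }
    return 'responded';       // any other status code, including 200
}

// Hypothetical wrapper: performs one request and classifies the result.
function probe_url(string $url, int $timeoutSeconds = 10): string
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    // Assumed timeouts so that an unreachable host does not stall the script.
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeoutSeconds);
    curl_setopt($ch, CURLOPT_TIMEOUT, $timeoutSeconds);
    curl_exec($ch);
    return classify_http_code((int) curl_getinfo($ch, CURLINFO_HTTP_CODE));
}
```

A call such as `probe_url($url)` inside the `foreach` would replace the `echo` line. Note that a site which serves an error page with status 200 still comes back as `responded`; the status code alone cannot detect that case.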

1 Comment

Kindly note the CURLOPT_SSL_VERIFYPEER option: for URLs starting with https, cURL also verifies the server's certificate, and the request fails if that check fails. To skip the check, use curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
