I am trying to fetch the header info from multiple web pages. I first tried sequential single cURL requests, using the code shown below:

<?php
$arr = array(
    "John", "Mary",
    "William", "Peter",
    "James", "Emma",
    "George", "Elizabeth",
    "Charles", "Margaret",
);

// One handle, reused for every request; the requests run one after another.
$ch = curl_init();

for ($i = 0; $i < count($arr); $i++) {
    $url = "https://example.com/" . $arr[$i];
    $options = array(
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_HEADER         => true,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_ENCODING       => "",
        CURLOPT_SSL_VERIFYPEER => false,
        CURLOPT_AUTOREFERER    => true,
        CURLOPT_CONNECTTIMEOUT => 120,
        CURLOPT_TIMEOUT        => 120,
        CURLOPT_MAXREDIRS      => 10,
    );

    curl_setopt_array($ch, $options);
    $response = curl_exec($ch);
    $httpCode = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    if ( $httpCode != 200 ){
        echo $arr[$i]." Error<br>";
    } else {
        echo $arr[$i]." Success<br>";
    }
}

curl_close($ch);
?>

But this code takes a very long time to execute. I searched the internet and found curl_multi_exec, which can run multiple cURL requests at the same time. So now I use this code:

<?php
ini_set('max_execution_time', 0);

$arr = array(
    "John", "Mary",
    "William", "Peter",
    "James", "Emma",
    "George", "Elizabeth",
    "Charles", "Margaret",
);

function multiRequest($data) {

  // array of curl handles
  $curly = array();
  // data to be returned
  $result = array();

  // multi handle
  $mh = curl_multi_init();

  // loop through $data and create curl handles
  // then add them to the multi-handle
  foreach ($data as $id => $d) {

    $curly[$id] = curl_init();

    $url = "https://example.com/".$data[$id];

    $options = array(
      CURLOPT_URL            => $url,
      CURLOPT_RETURNTRANSFER => true,
      CURLOPT_HEADER         => true,
      CURLOPT_FOLLOWLOCATION => true,
      CURLOPT_ENCODING       => "",
      CURLOPT_SSL_VERIFYPEER => false,
      CURLOPT_AUTOREFERER    => true,
      CURLOPT_CONNECTTIMEOUT => 120,
      CURLOPT_TIMEOUT        => 120,
      CURLOPT_MAXREDIRS      => 10,
    );

    // extra options?
    if (!empty($options)) {
      curl_setopt_array($curly[$id], $options);
    }

    curl_multi_add_handle($mh, $curly[$id]);
  }

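  // execute the handles
  $running = null;
  do {
    $status = curl_multi_exec($mh, $running);
    // wait for activity on any connection instead of spinning the CPU
    if ($running > 0) {
      curl_multi_select($mh);
    }
  } while($running > 0 && $status == CURLM_OK);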


  // get content and remove handles
  foreach($curly as $id => $c) {
    $result[$id] = curl_multi_getcontent($c);
    //Code to fetch header info
    curl_multi_remove_handle($mh, $c);
  }

  // all done
  curl_multi_close($mh);

  return $result;
}

multiRequest($arr);

?>

How can I fetch the header info of each request when using curl_multi_init?

1 Answer

This code from your first example:

$httpCode = curl_getinfo($ch, CURLINFO_HTTP_CODE);
if ( $httpCode != 200 ){
    echo $arr[$i]." Error<br>";
} else {
    echo $arr[$i]." Success<br>";
}

will work even if the curl handle was executed by curl_multi_exec().


In your second example, replace this code:

// get content and remove handles
foreach($curly as $id => $c) {
    $result[$id] = curl_multi_getcontent($c);
    //Code to fetch header info
    curl_multi_remove_handle($mh, $c);
}

with this:

// get content and remove handles
foreach($curly as $id => $c) {

    $result[$id] = curl_multi_getcontent($c);

    $httpCode = curl_getinfo($c, CURLINFO_HTTP_CODE);
    $url      = curl_getinfo($c, CURLINFO_EFFECTIVE_URL);

    if ( $httpCode != 200 ){
        echo $url." Error<br>";
    } else {
        echo $url." Success<br>";
    }

    curl_multi_remove_handle($mh, $c);

}
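
If you need the response headers themselves rather than only the status code, one possible approach is to split them off the raw response. Because the handles are created with CURLOPT_HEADER => true, the string returned by curl_multi_getcontent() starts with the headers, and curl_getinfo($c, CURLINFO_HEADER_SIZE) reports how many bytes they occupy. The sketch below is only an illustration under those assumptions: splitResponse() is a hypothetical helper name (not part of cURL), and it keeps just the last header block, since CURLOPT_FOLLOWLOCATION puts the headers of every redirect in front of the final ones.

```php
<?php
// Hypothetical helper (not a cURL function): split a raw response that was
// fetched with CURLOPT_HEADER => true into its final header block and body.
// $headerSize is expected to come from curl_getinfo($c, CURLINFO_HEADER_SIZE).
function splitResponse(string $raw, int $headerSize): array
{
    $headerText = substr($raw, 0, $headerSize);
    $body       = (string) substr($raw, $headerSize);

    // When redirects were followed, several header blocks are concatenated,
    // separated by a blank line. Keep only the last block.
    $blocks = explode("\r\n\r\n", trim($headerText));
    $last   = end($blocks);

    $headers = array();
    foreach (explode("\r\n", $last) as $i => $line) {
        if ($i === 0) {
            // First line of a block is the status line, e.g. "HTTP/1.1 200 OK"
            $headers['status_line'] = $line;
            continue;
        }
        // Split "Name: value" on the first colon only
        $parts = explode(':', $line, 2);
        if (count($parts) === 2) {
            // Header names are case-insensitive, so normalise to lower case.
            // A repeated header (e.g. Set-Cookie) overwrites the earlier one.
            $headers[strtolower(trim($parts[0]))] = trim($parts[1]);
        }
    }

    return array('headers' => $headers, 'body' => $body);
}
```

Inside the foreach loop above it could be called as splitResponse(curl_multi_getcontent($c), curl_getinfo($c, CURLINFO_HEADER_SIZE)), before curl_multi_remove_handle(). If you need every value of a repeated header, collect the values into arrays instead of overwriting them.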