
The following is my working code sample. To run it, add your own sleep.php that calls sleep($_GET['sleep']) and then echoes how long it slept.

class MultiCurl {
    private $mc;
    private $running;
    private $execStatus;

    public function __construct() {
        $this->mc = curl_multi_init();
    }

    public function addCurl($ch) {
        $code = curl_multi_add_handle($this->mc, $ch);

        if ($code === CURLM_OK || $code === CURLM_CALL_MULTI_PERFORM) {
            do {
                $this->execStatus = curl_multi_exec($this->mc, $this->running);
            } while ($this->execStatus === CURLM_CALL_MULTI_PERFORM);

            return $this->getKey($ch);
        }
        return null;
    }

    public function getNextResult() {
        if ($this->running) {
            while ($this->running && ($this->execStatus == CURLM_OK || $this->execStatus == CURLM_CALL_MULTI_PERFORM)) {
                usleep(2500);
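                // Store the status so the loop condition above can react to errors from this call.
                $this->execStatus = curl_multi_exec($this->mc, $this->running);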

                $responses = $this->readResponses();
                if ($responses !== null) {
                    return $responses;
                }
            }
        } else {
            return $this->readResponses();
        }

        return null;
    }

    private function readResponses() {
        $responses = [];
        while ($done = curl_multi_info_read($this->mc)) {
            $key = $this->getKey($done['handle']);

            $done['response'] = curl_multi_getcontent($done['handle']);
            $done['info'] = curl_getinfo($done['handle']);
            $error = curl_error($done['handle']);
            if ($error) {
                $done['error'] = $error;
            }

            $responses[$key] = $done;

            curl_multi_remove_handle($this->mc, $done['handle']);
            curl_close($done['handle']);
        }

        if (!empty($responses)) {
            return $responses;
        }

        return null;
    }

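    private function getKey($ch) {
        // Relies on curl handles being resources (PHP 7). In PHP 8 they are
        // CurlHandle objects, which cannot be cast to string.
        return (string)$ch;
    }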
}

function getHandle($url) {
    $ch = curl_init();
    curl_setopt_array($ch, [
        CURLOPT_URL => $url,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_CONNECTTIMEOUT => 5
    ]);
    return $ch;
}

$totalTime = microtime(true);

$multi = new MultiCurl();

$keys = [];
$addCurlHandles = microtime(true);
$keys[] = $multi->addCurl(getHandle('http://localhost/sleep.php?sleep=5'));
for ($i = 0; $i < 5; $i++) {
    $keys[] = $multi->addCurl(getHandle('http://localhost/sleep.php?sleep=' . random_int(1, 4)));
}
echo 'Add curl handles: ' . (microtime(true) - $addCurlHandles) . "\n";

/**/
$loop = microtime(true);
while (microtime(true) - $loop < 2) {
    usleep(100);
}
echo 'Loop: ' . (microtime(true) - $loop) . "\n";
/**/

$getResults = microtime(true);
while ($result = $multi->getNextResult()) {
    foreach ($result as $key => $response) {
        echo $response['response'] . "\n";
    }
}
echo 'Get results: ' . (microtime(true) - $getResults) . "\n";

echo 'Total time: ' . (microtime(true) - $totalTime) . "\n";
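
For anyone trying to reproduce this, here is a minimal sketch of what sleep.php might look like. It is an assumption based on the "Slept N" lines in the output; the function name sleepAndReport and the 0 to 10 second clamp are hypothetical additions, not part of the original setup.

```php
<?php
// Hypothetical sleep.php (an assumed reconstruction, not the poster's file).

/**
 * Sleep for the requested number of seconds and return the response body.
 * The 0..10 clamp is an added safety assumption so a stray request
 * cannot tie up a worker indefinitely.
 */
function sleepAndReport(array $query): string
{
    $seconds = max(0, min(10, (int)($query['sleep'] ?? 0)));
    sleep($seconds);
    return 'Slept ' . $seconds;
}

// Only answer automatically when served over HTTP, not when included from the CLI.
if (PHP_SAPI !== 'cli') {
    echo sleepAndReport($_GET);
}
```

Requesting http://localhost/sleep.php?sleep=3 against this sketch should block for about three seconds before responding.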

Now play around with the for loop that calls $multi->addCurl. When the loop adds 4 handles, the output is something like

Add curl handles: 0.0007021427154541
Loop: 2.0000491142273
Slept 1
Slept 3
Slept 3
Slept 4
Slept 5
Get results: 5.0043671131134
Total time: 7.0052678585052

But when I add 5 or more, the output is

Add curl handles: 0.0014941692352295
Loop: 2.00008893013
Slept 1
Slept 2
Slept 4
Slept 4
Slept 4
Slept 5
Get results: 3.0007629394531
Total time: 5.0025300979614

As you can see, the latter does more work but finishes faster, because the 5-second sleep request was actually sent before the 2-second while loop started running.

With the smaller number of handles, calling curl_multi_select and curl_multi_exec in a loop until curl_multi_select stops returning -1 resolved this, but it is very unreliable: it doesn't work at all on another computer, and it sometimes gets stuck with curl_multi_select always returning -1.
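
To make that workaround concrete, here is a rough sketch of such a loop. The function name pumpUntilSelectable, the 10 ms select timeout, and the $maxWait cap are assumptions for illustration; the cap is only there so the loop cannot hang forever when curl_multi_select keeps returning -1.

```php
<?php
// Sketch only: names, the select timeout and the $maxWait cap are assumptions.

/**
 * Drive a multi handle until curl_multi_select() stops returning -1,
 * all transfers finish, or $maxWait seconds pass.
 *
 * @return int number of transfers still running
 */
function pumpUntilSelectable($mh, float $maxWait = 1.0): int
{
    $running = 0;
    $deadline = microtime(true) + $maxWait;

    do {
        // Let curl make progress on every attached handle.
        do {
            $status = curl_multi_exec($mh, $running);
        } while ($status === CURLM_CALL_MULTI_PERFORM);

        // Wait up to 10 ms for socket activity.
        $ready = curl_multi_select($mh, 0.01);
        if ($ready === -1) {
            // Nothing to wait on yet; back off briefly instead of spinning.
            usleep(1000);
        }
    } while ($ready === -1 && $running > 0 && microtime(true) < $deadline);

    return $running;
}
```

Calling it right after the last curl_multi_add_handle would be the natural place, but as described above, this approach may not reliably tell you that the requests have actually gone out.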

  • I am not quite sure what exactly you are trying to resolve here. Your script seems to work as expected. I can see there are three questions. Firstly, "How am I supposed to know the request has been sent and", for which I would say: since you are sending the requests, you should know which URLs have been requested already. I am also not sure what you mean by "it's safe to start executing arbitrary code", or by your third question, "With the smaller number of handles, .. but not under high load". Commented Feb 19, 2018 at 10:44
  • I need to run code while waiting for the response(s). This doesn't work in my example with 4 handles. Commented Feb 19, 2018 at 11:44
  • Under high load (~50 scripts similar to the example running in parallel under php-fpm), curl_multi_select will always return -1, no matter what. Commented Feb 19, 2018 at 11:46
  • Do you want to use curl_multi_select only, or are you open to using other, better libraries? Commented Feb 19, 2018 at 14:14
  • The question is about curl specifically. But if something else actually works, I'm willing to try it. Commented Feb 19, 2018 at 14:45

1 Answer


I have found a hacky solution: check pretransfer_time using curl_getinfo. It stays at 0 until curl is just about to start the transfer.

I have published the source code on github: https://github.com/rinu/multi-curl

Better, cleaner solutions are still very welcome.
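
The pretransfer_time check might look roughly like the sketch below. This is not the code from the linked repository; the function name waitUntilRequestSent, the 1 ms back-off, and the $maxWait cap are assumptions, and $ch is assumed to have already been added to $mh.

```php
<?php
// Sketch only; not the code from the linked repository.

/**
 * Keep driving the multi handle until the given easy handle reports a
 * non-zero pretransfer_time, i.e. curl is ready to start the transfer.
 *
 * @return bool true if pretransfer was reached within $maxWait seconds
 */
function waitUntilRequestSent($mh, $ch, float $maxWait = 5.0): bool
{
    $running = 0;
    $deadline = microtime(true) + $maxWait;

    do {
        $status = curl_multi_exec($mh, $running);

        // pretransfer_time stays 0 until connect and negotiation are done.
        if (curl_getinfo($ch, CURLINFO_PRETRANSFER_TIME) > 0) {
            return true;
        }
        if ($status !== CURLM_OK && $status !== CURLM_CALL_MULTI_PERFORM) {
            return false; // the multi handle reported an error
        }
        if ($running === 0) {
            return false; // transfer ended without ever reaching pretransfer
        }
        usleep(1000); // brief back-off so the loop does not spin
    } while (microtime(true) < $deadline);

    return false;
}
```

In the MultiCurl class from the question, a check like this would presumably replace the do/while in addCurl, so that addCurl only returns once the request is on its way.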


1 Comment

I don't know about better, cleaner solutions, but I go with a proc_open approach when I really need to in PHP. That's not clean at all, but it works a lot better than curl_multi in PHP, which has always given me problems.
