Let’s get one thing out in the open. Curl is sweet. It does its job very well, and I’m absolutely thrilled it exists.

If you’re using curl in your PHP app to make web requests, you’ve probably realized that by doing them one after the other, the total time of your request is the sum of all the requests put together. That’s lame.

Unfortunately, using curl_multi_exec is poorly documented in the PHP manual.

Let’s say that your app is hitting APIs from these servers:

Google: .1s
Microsoft: .3s
rustyrazorblade.com: .5s

Your total time will be .9s, just for API calls.

By using curl_multi_exec, you can execute those requests in parallel, and you’ll only be limited by the slowest request, which is about .5 sec to rustyrazorblade in this case, assuming your download bandwidth is not slowing you down.
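
To make the arithmetic concrete, here’s a tiny sketch using the example latencies above. The numbers are the illustrative ones from this post, not measurements, and the variable names are just for this example.

```php
<?php
// Illustrative latencies in seconds, taken from the example above.
$latencies = array(
    'Google'              => 0.1,
    'Microsoft'           => 0.3,
    'rustyrazorblade.com' => 0.5,
);

// One after the other: the waits add up.
$sequential = array_sum($latencies);

// In parallel: you wait roughly as long as the slowest request.
$parallel = max($latencies);

printf("sequential: %.1fs, parallel: about %.1fs\n", $sequential, $parallel);
```

In practice the parallel figure is a lower bound; bandwidth and connection setup can push it higher.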

Sample code:

$nodes = array('http://www.google.com', 'http://www.microsoft.com', 'http://www.rustyrazorblade.com');
$node_count = count($nodes);

$curl_arr = array();
$master = curl_multi_init();

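// create one curl handle per URL and attach it to the multi handle
for($i = 0; $i < $node_count; $i++)
{
	$url = $nodes[$i];
	$curl_arr[$i] = curl_init($url);
	// return the response as a string instead of printing it
	curl_setopt($curl_arr[$i], CURLOPT_RETURNTRANSFER, true);
	curl_multi_add_handle($master, $curl_arr[$i]);
}

// keep driving the transfers until none are still running
do {
	curl_multi_exec($master, $running);
} while($running > 0);

echo "results: ";
for($i = 0; $i < $node_count; $i++)
{
	// the handles are in the same order as $nodes
	$results = curl_multi_getcontent($curl_arr[$i]);
	echo($i . "\n" . $results . "\n");
}
echo 'done';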
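
The do/while loop above works, but it spins the CPU while it waits, and the script never releases its handles. Here’s a minimal sketch of one way to tidy that up. The function name fetch_all and its $timeout parameter are made up for this example; the curl_multi_* calls are the standard PHP ones.

```php
<?php
// Hypothetical helper (not part of the sample above): fetch several URLs in
// parallel and return the response bodies, keyed like the input array.
function fetch_all(array $urls, $timeout = 10)
{
    $master  = curl_multi_init();
    $handles = array();

    foreach ($urls as $key => $url) {
        $handles[$key] = curl_init($url);
        curl_setopt($handles[$key], CURLOPT_RETURNTRANSFER, true);
        curl_setopt($handles[$key], CURLOPT_TIMEOUT, $timeout);
        curl_multi_add_handle($master, $handles[$key]);
    }

    $running = 0;
    do {
        $status = curl_multi_exec($master, $running);
        if ($running > 0 && curl_multi_select($master, 1.0) === -1) {
            // select could not wait on anything; pause briefly instead of spinning
            usleep(100000);
        }
    } while ($running > 0
        && ($status === CURLM_OK || $status === CURLM_CALL_MULTI_PERFORM));

    $results = array();
    foreach ($handles as $key => $handle) {
        $results[$key] = curl_multi_getcontent($handle);
        // detach and release each handle once its content has been read
        curl_multi_remove_handle($master, $handle);
        curl_close($handle);
    }
    curl_multi_close($master);

    return $results;
}
```

Calling fetch_all($nodes) with the array from the sample should give you the three bodies in the same order. curl_multi_select() blocks until there is activity on one of the connections (or the one second passes), so the loop sleeps instead of burning CPU.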

It’s really not documented on php.net how to use curl_multi_getcontent, so hopefully this helps someone.

78 Responses to Executing multiple curl requests in parallel with PHP and curl_multi_exec

  1. techiegroups says:

    You are right, the lack of documentation about this on the php.net website is sad. I’m glad you put a link to this guide as a comment.

  2. You forgot to close your handles I think….

    //close the handles

    inside the last for loop:
    curl_multi_remove_handle($curl_arr[$i]);

    after the last for loop:
    curl_multi_close($master);

    also see the PHP manual:
    http://cn.php.net/manual/en/function.curl-multi-init.php

  3. small mistake, it should read:

    curl_multi_remove_handle($master, $curl_arr[$i]);

  4. jon says:

    Hey Thijs,
    I’m pretty sure it’s right as is. The for loop is to add each of the curl handles to the multihandle. If I removed them, it wouldn’t work. Try my above code, it’ll work.

  5. Hi Jon,

    I mean it’s better to close the handles after you use them (of course not before).

    Something like this:

    [code]
    for($i = 0; $i < $node_count; $i++)
    {
    $results = curl_multi_getcontent ( $curl_arr[$i] );
    echo( $i . "\n" . $results . "\n");
    url_multi_remove_handle($curl_arr[$i]);
    }
    url_multi_remove_handle($master, $curl_arr[$i]);
    echo 'done';

    [/code]

  6. Hi Jon,

    I mean it’s better to close the handles after you use them (of course not before).

    Something like this:

    [code]
    for($i = 0; $i < $node_count; $i++)
    {
    $results = curl_multi_getcontent ( $curl_arr[$i] );
    echo( $i . "\n" . $results . "\n");
    curl_multi_remove_handle($master, $curl_arr[$i]);
    }
    curl_multi_close($master);
    echo 'done';

    [/code]

    ps. please disregard the previous comment from me. Difficult to copy/paste code here ;)

  7. sean says:

    hi, quick question: what would be the most efficient way to use cURL to grab a page AND the header, but only display the page to the users browser. i would ideally like to do this by only opening one cURL session. currently i have the following code, which, as you can see, is intended to pass all $_GET and $_POST info, but currently has no mechanism to keep track of cookies:

    $target_domain = 'http://targetdomain.com/fowlder/page.cfm';
    $ch = curl_init();
    curl_setopt ($ch, CURLOPT_URL, "{$target_domain}?{$_SERVER["QUERY_STRING"]}");
    curl_setopt ($ch, CURLOPT_HEADER, true); // this displays header info on users browser but i really want to just load it into variables
    curl_setopt ($ch, CURLOPT_POST, true);
    curl_setopt ($ch, CURLOPT_POSTFIELDS, $_POST);
    curl_exec ($ch);
    curl_close ($ch);

  8. jon says:

    Hey Sean,
    I poked around the PHP site, and there’s actually someone who wrote a function to do this.

    Give it a try, please let me know if it works.

    http://us3.php.net/manual/en/function.curl-setopt.php#42009

  9. jon says:

    All you need is this part, btw:

    list($response_headers,$response_body) = explode("\r\n\r\n",$r,2);

    Enjoy.
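
A minimal sketch of how that one-liner might be wrapped up, assuming $r is the string returned by curl_exec() with both CURLOPT_HEADER and CURLOPT_RETURNTRANSFER switched on (so nothing is printed to the user’s browser). The helper name split_response and the sample response are made up for illustration.

```php
<?php
// Hypothetical helper: split a raw HTTP response into its header block and body.
function split_response($raw)
{
    $parts   = explode("\r\n\r\n", $raw, 2);
    $headers = $parts[0];
    // if there is no blank line, treat the body as empty
    $body    = isset($parts[1]) ? $parts[1] : '';
    return array($headers, $body);
}

// Made-up response, just to show the shape of the result.
$raw = "HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n<html>hello</html>";
list($response_headers, $response_body) = split_response($raw);

// only the body goes to the browser; the headers stay in a variable
echo $response_body, "\n";
```

One caveat: if the request follows redirects, or the server sends an interim "100 Continue", the response can contain more than one header block and this simple split will leave some headers in the body. In that case, cutting the string at curl_getinfo($ch, CURLINFO_HEADER_SIZE) with substr() is probably the safer route.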

  10. george says:

    Hello,
    I need some advice, please:
    I’m trying to connect to a website using curl, but it requires javascript and I am not able to see what I need. Is there another solution to this?
    Regards

  11. jon says:

    You’ll have to figure out which javascript file you’ll need, then run your code through a javascript interpreter. To be honest, I’ve never done it, and I have no idea how good they are.

    Here’s 1 project:
    http://j4p5.sourceforge.net/

  12. frank says:

    @george:

    In the past, when faced with that problem, I ended up writing a few regular expressions to get the data I needed directly out of the javascript files.

    Depending on what you’re trying to achieve, you may find this type of approach simplest.

  13. juan says:

    My problem is:

    The first step is to log in with => http://192.168.1.100:8088/asterisk/manager?action=login&username=admin&secret=123

    and the second step is => http://192.168.1.100:8088/asterisk/rawman?action=sipeers

    but the second step does not succeed.

    The browser says: Response: Error Message: Authentication Required

    Why, if I’m logged in?
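
One possible cause, and it is only a guess since the thread doesn’t show the code: the login response sets a session cookie, and the second request is sent without it. Here’s a minimal sketch of keeping cookies between requests by reusing one handle with curl’s cookie engine turned on. The helper names new_session_handle and session_get are made up for this example.

```php
<?php
// Hypothetical helper: create one handle with curl's cookie engine enabled,
// so cookies set by a login response are sent on later requests.
function new_session_handle()
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    // an empty string enables cookie handling without loading a cookie file
    curl_setopt($ch, CURLOPT_COOKIEFILE, '');
    return $ch;
}

// Hypothetical helper: request a URL on an existing handle, reusing its cookies.
function session_get($ch, $url)
{
    curl_setopt($ch, CURLOPT_URL, $url);
    return curl_exec($ch);
}
```

The idea is to create the handle once, call session_get() with the login URL, then call it again with the second URL on the same handle. If the two requests run in separate scripts, pointing CURLOPT_COOKIEJAR and CURLOPT_COOKIEFILE at the same file should have a similar effect.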

  14. Joe Lencioni says:

    Fantastic! I came here from the PHP documentation and this is exactly what I was looking for. And, I agree with Thijs–you should close your handles after you are done using them.

  15. neor says:

    Not nice with pages which have moved
    (301, 302 HTTP status)

  16. arpo says:

    Hi all
    I need some quick advice: I have a program crawling many pages from the same server with many file_get_contents() calls. Do you think I would get some improvement by replacing them with just a curl_multi_exec() call to retrieve them all at once, or will it probably take the same time since they all come from the same server?

  17. jon says:

    If you have the bandwidth, it’ll be faster to use curl_multi_exec.

  18. CoLiq says:

    Anyone know how to catch redirected pages?

    I have a problem like this:

    URL1 -> URL2 -> URL3

    -> means redirect

    So I want to get the URL3 content by entering only URL1. Anyone know how to solve this problem?

    Thx..

  19. dan says:

    Coliq, try this:

    curl_setopt($curl, CURLOPT_FOLLOWLOCATION, true);
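
A slightly fuller sketch built around that option, assuming you also want a cap on the number of hops and want to know which URL you finally landed on. The function name fetch_following_redirects and the default of 5 redirects are arbitrary choices for illustration.

```php
<?php
// Hypothetical helper: fetch a URL, follow redirects, and report the final URL.
function fetch_following_redirects($url, $max_redirects = 5)
{
    $curl = curl_init($url);
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
    // follow Location: headers (301, 302, ...) automatically
    curl_setopt($curl, CURLOPT_FOLLOWLOCATION, true);
    // stop after a fixed number of hops so a redirect loop cannot hang the script
    curl_setopt($curl, CURLOPT_MAXREDIRS, $max_redirects);

    $body      = curl_exec($curl);
    // the URL that was actually fetched last (URL3 in the example above)
    $final_url = curl_getinfo($curl, CURLINFO_EFFECTIVE_URL);
    curl_close($curl);

    return array($final_url, $body);
}
```

The same curl_setopt() calls can be applied to each handle before it is added to a multi handle, which should also deal with the moved pages mentioned earlier in the comments.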

  20. Edwin says:

    Just thinking out loud….
    Wouldn’t it be an idea to use some ajax to load each api’s data on its own, which would mean in this case the first response would arrive at .1s from Google? I mean, put the cURL part in a function in a different php file, then call that php file from a div, with the proper variables, using an onload function.

    just my thoughts and nothing concrete (yet)

  21. jon says:

    Yes, you can execute multiple requests through ajax, but what if you’re not using PHP through the web, or the calls you’re making require a secret api key?

  22. Edwin says:

    If it’s only for your own use (as I assume that’s what you mean by not through the web), then I don’t see much need to worry about performance or neat-looking ajax gifs telling the user something is going on. I would use your way then. The ajax thought came to me as I was reading your post and thinking about ways to handle this.

  23. Ross says:

    Hi Jon,

    I’m trying to fill in the documentation gaps in cURL – I hope it’s okay to include something similar to this example?

    Thanks,
    Ross

  24. jon says:

    Sure. Feel free to link to this page as well.

  25. Mike Adolphs says:

    Damn, that saved me a lot of time doing nasty API calls and writing them in a temporary file instead of polling the API directly.

    Thanks for the good work!

  26. Jeff says:

    Thank you very much for this. Do you know how many URLs curl_multi_exec() can handle at once and not crash my application? Is it just dependent on my Memory?

  27. jon says:

    Hey Jeff,
    I haven’t had a chance to try to max out curl_multi_exec. I suspect you’ll hit a bottleneck on your network before you’ll run out of memory.

  28. Jeff says:

    Thanks for your reply Jon. Just like you said, I tried it with 20 URLs at once and it crashed Apache.

  29. Dave says:

    Jeff, there must be something whacky with your Apache installation. I’m doing something fairly similar to the above code, and when I clear the entire cache, it calls out to well over 20 URLs. I’ve yet to see Apache (2.2.6 – standard Slackware binary package) crash on it. (2.2.8 and later crash on me even without calling out to 20+ URLs, curiously. I’m pretty sure it’s just a packaging screwup, though, as I’ve tested a locally compiled 2.2.10 without hitting any crashes on cURL.)

    Cheers,
    – Dave

  30. Simon says:

    I thought this was a great piece of code, and it was very useful. However, I think I can improve on it.
    In the original example, you showed how to reduce the wait from 0.9 to 0.5 seconds. The question arises that if 1 website is very slow, you could hold up your app for a long time. It would be great if we could work on the returned files as soon as it was returned, rather than wait for ALL files in the “multi handle”. The function (and its call) should echo out as soon as a site is downloaded, then after all pages are downloaded should return an array in the same order as the input $nodes.

    function getMultipleDocuments($nodes, $referer){
        set_time_limit(90);
        if(!$referer){
            $referer = $nodes[0];
        }
        $node_count = count($nodes);

        $curl_arr = array();
        $master = curl_multi_init();

        for($i = 0; $i < $node_count; $i++)
        {
            $curl_arr[$i] = curl_init($nodes[$i]);
            curl_setopt($curl_arr[$i], CURLOPT_FOLLOWLOCATION, true);
            curl_setopt($curl_arr[$i],CURLOPT_FRESH_CONNECT,true);
            curl_setopt($curl_arr[$i],CURLOPT_CONNECTTIMEOUT,10);
            curl_setopt($curl_arr[$i],CURLOPT_RETURNTRANSFER,true);
            curl_setopt($curl_arr[$i],CURLOPT_REFERER,$referer);
            curl_setopt($curl_arr[$i],CURLOPT_TIMEOUT,30);

            curl_multi_add_handle($master, $curl_arr[$i]);
        }
        $finalresult = array();
        $returnedOrder = array();
        do{
            curl_multi_exec($master, $running);
            // read every finished transfer; more than one can complete in the same pass
            while($info = curl_multi_info_read($master)){
                $finalresult[] = curl_multi_getcontent($info['handle']);
                $returnedOrder[] = array_search($info['handle'], $curl_arr, true);
                curl_multi_remove_handle($master, $info['handle']);
                curl_close($curl_arr[end($returnedOrder)]);
                echo 'downloaded '.$nodes[end($returnedOrder)].'. We can process it further straight away, but for this example, we will not.';
                ob_flush();flush();
            }
        }while($running > 0);
        curl_multi_close($master);

        set_time_limit(30);
        // key each document by its position in $nodes, then sort into input order
        $documents = array_combine($returnedOrder, $finalresult);
        ksort($documents);
        return $documents;
    }

    $nodes = array('http://mediumSpeedSite.org', 'http://fastSpeedSite.com', 'http://quiteSlowSite.com');
    $returnedDocs = getMultipleDocuments($nodes, null);

  31. simmeh says:

    thanks for posting this, there is little to no documentation on the php.net website, this cleared things up

  32. Josh Fraser says:

    Thanks for sharing. I made some modifications so that each request is processed as soon as it completes. I’ve found that makes things a lot faster, particularly when you’re dealing with a large number of requests:

    http://onlineaspect.com/2009/01/26/how-to-use-curl_multi-without-blocking/

  33. macrunnign says:

    How does one speed up the time it takes for a CURL script to execute? Currently I am trying to execute 1 URL. I am told it is only taking .2 seconds on the other end for the server to respond but it is taking roughly 20 seconds for the CURL script to fully execute and return a response to me. Any ideas?

  34. jon says:

    What else are you doing on the page?

    The simplest way to find out what’s taking a long time is to time the various parts of the page using microtime(true). If you want to be more advanced, look into xdebug and kcachegrind (or webgrind)

    Check it out here: http://www.rustyrazorblade.com/2007/07/26/php-setting-up-xdebug-with-kcachegrind/
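
A minimal sketch of the microtime(true) approach, for anyone who hasn’t done it before. The helper name time_call and the fake workload are made up for this example; in real code the closure would wrap curl_exec() or whatever step is under suspicion.

```php
<?php
// Hypothetical helper: run a callable and report how long it took, in seconds.
function time_call($label, $fn)
{
    $start   = microtime(true);
    $result  = call_user_func($fn);
    $elapsed = microtime(true) - $start;
    printf("%s took %.3fs\n", $label, $elapsed);
    return array($result, $elapsed);
}

// Stand-in for a slow step, so the example runs without a network.
list($value, $seconds) = time_call('fake work', function () {
    usleep(50000); // pretend to wait 50ms
    return 'ok';
});
```

For curl specifically, curl_getinfo() can also break down a finished request: CURLINFO_NAMELOOKUP_TIME, CURLINFO_CONNECT_TIME and CURLINFO_TOTAL_TIME show whether the time went on DNS, connecting, or the transfer itself.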

  35. macrunnign says:

    At this point I am just running the curl execution without anything else going on. It is a test page at the moment. I am trying to work out a solution for another problem. Would you happen to know how I capture the response from the other server? I am not sure how to load it into a variable. Currently I am getting the response but haven’t figured out what I am supposed to use to capture the other server’s response.
    thanks.

  36. jon says:

    To have the results returned by curl_exec, you need to use this:

    curl_setopt($curl_handle, CURLOPT_RETURNTRANSFER, true);

  37. macrunnign says:

    I’ve got the response already, I’m just not sure how to load it into a variable that I can then use to manipulate the page. Are you saying $curl_handle will be my variable?
    So that if I want to create an if() statement I would use if($curl_handle = "response text") { do this }

    ???

  38. jon says:

    You’d do this to get the response into a variable:

    $curl_handle = curl_init("http://whatever.com");
    curl_setopt($curl_handle, CURLOPT_RETURNTRANSFER, true);
    $response = curl_exec($curl_handle);

    If you’re worried about timeout, you can also do:

    curl_setopt($curl_handle, CURLOPT_TIMEOUT, 2);

    Last param is seconds.

    You should get familiar with the curl options here: http://us2.php.net/manual/en/function.curl-setopt.php

    Good luck,
    Jon

  39. macrunnign says:

    I have tried this but I’m not sure I am getting what I need. Here is what i have so far. Maybe you can see if I messed up somewhere:
    [code]
    $myVin = $_GET['vin'];
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_HEADER, 0);
    curl_setopt($ch, CURLOPT_URL, "http://socket.somewebsite.com:8080/?UID=C412012&REQUEST=INV&VIN=$myVin&INV_DATE=N&ONE_OWNER=Y");
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    $result1 = curl_exec($ch);
    if($result1['curlopt_returntransfer'] == "Yes 1") {
    echo "We have a Winner!";
    } else {
    echo "Not a One Owner";
    }
    curl_close($ch);
    [/code]

  40. macrunnign says:

    P.S. I’m not worried about timeout. Thanks. I am more concerned with the amount of time it is taking to send and receive the response though.

  41. macrunnign says:

    Ok. So I am confused now. I’ve got the code I mentioned above working. The problem seems to be that the variable $result1 does not want to work properly in my if statement.
    I believe it should be written so but please let me know if I am wrong.

    if($result1 == "Yes 1") {
    echo "We have a Winner!";
    }
    if(result1 == "Yes N") {
    echo "Not a One Owner";
    }

    For some reason my if statement does not appear to be working. Am I missing something on this?

  42. macrunnign says:

    Oh yeah, I’ve been poring over the manual on php.net as you mentioned earlier but still cannot figure out for the life of me why my variable is not being accepted. From what I gather my variable $result1 should read either "Yes N" or "Yes 1" but when I try to pass it to my if statement the if statement does not seem to be recognizing the variable at all. I can echo $result1 onto the page but something seems amiss for the if statement not to be recognizing this.
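
One guess, since the actual response isn’t shown in this thread: the server may be ending its reply with a line break or other whitespace. That is invisible when you echo the value onto a page, but it makes an exact comparison fail. Here’s a minimal sketch of comparing the trimmed value instead. The helper name is_one_owner and the sample strings are made up for illustration.

```php
<?php
// Hypothetical helper: compare the response after stripping surrounding whitespace.
function is_one_owner($response)
{
    return trim($response) === 'Yes 1';
}

// Made-up responses, assuming the server appends a line break to its reply.
$one_owner = is_one_owner("Yes 1\r\n");
$not_one   = is_one_owner("Yes N\r\n");

var_dump($one_owner, $not_one);
```

Dumping the raw value with var_dump($result1) shows its exact length, which is a quick way to confirm or rule this out. It’s also worth checking that every comparison uses $result1 with the leading $; the second if statement as posted above is missing it.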

  43. jon says:

    I tried to hit that URL, it never gave a result. The problem is the remote site. Try it in your browser first.

  44. macrunnign says:

    Hi Jon, I just put a dummy url in this forum for security purposes. I think I’ve got it figured out. I just wish I could find something that would execute quicker. Total time it takes for this to send, receive and execute based on the response is 20 seconds. Way too long at the moment. I need it to be 1 or 2 seconds at most.

  45. jon says:

    You’re being pretty vague… I’ve suggested ways to find out what’s taking up so much time.

    I can’t really help you for free anymore – you might want to look to a user group or consultant to help you with this. Good luck.

  46. macrunnign says:

    oh. sorry to take up your time. I thought this was a free forum. my apologies.

  47. jon says:

    It is, but you aren’t taking the advice I’ve given to isolate the code that’s taking 20 seconds.

  48. Stern87 says:

    I use curl in my scripts. Your sample is great. But I’ve got a problem: for example, on an HTML page I have 10 links, and each link runs a script in which curl downloads a page:
    http://www.example.com/?page=1 in the first link
    http://www.example.com/?page=2 in the second link

    http://www.example.com/?page=10 in the tenth link

    If you quickly open each link (of the HTML page) in new tabs, you’ll see the funny thing: until one tab (one copy of the running script) has completed, the next copy of the script will not run.
    This problem occurs with both "curl" and "multi-curl" :(

    Please help. How can I fix that?

  49. jon says:

    Are you sure it’s not just your browser being throttled? Try loading the URL from a second machine and see if you still get that delay.

  50. Stern87 says:

    I’m sure. From the second machine – it’s ok, but if you open next tab, next…. you’ll get the same funny problem.
    This is very sad.

    Here is simple example for an experiment.
    test.php:
    ===============================================
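    <?
    if (isset($_GET["link"])) {
    $link = "http://nehe.gamedev.net/lesson.asp?index=0".$_GET["link"];
    $nodes = array();
    array_push($nodes, $link);
    $node_count = count($nodes);

    $curl_arr = array();
    $master = curl_multi_init();

    for($i = 0; $i < $node_count; $i++)
    {
    $url = $nodes[$i];
    $curl_arr[$i] = curl_init($url);
    curl_setopt($curl_arr[$i], CURLOPT_RETURNTRANSFER, true);
    curl_multi_add_handle($master, $curl_arr[$i]);
    }

    do {
    curl_multi_exec($master, $running);
    } while($running > 0);

    echo "results: ";
    for($i = 0; $i < $node_count; $i++)
    {
    $results = curl_multi_getcontent($curl_arr[$i]);
    echo($i . "\n" . $results . "\n");
    }
    echo 'done';
    }
    ?>
    Link 1
    Link 2
    Link 3
    Link 4
    Link 5
    Link 6
    Link 7
    Link 8
    Link 9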
    ===============================================

    I’m using the Firefox browser, so if you open all the links in new tabs (middle click) rapidly, you will see what I mean.
    Until the curl block of the first running script finishes, the next running script will not start downloading. Why? How can I fix it?

    Thanks again!
    I appreciate you, Jon.
