Executing multiple curl requests in parallel with PHP and curl_multi_exec

February 20, 2008 – 4:17 pm

Let’s get one thing out in the open. Curl is sweet. It does its job very well, and I’m absolutely thrilled it exists.

If you’re using curl in your PHP app to make web requests, you’ve probably realized that by doing them one after the other, the total time of your request is the sum of all the requests put together. That’s lame.

Unfortunately, using curl_multi_exec is poorly documented in the PHP manual.

Let’s say that your app is hitting APIs from these servers:

Google: .1s
Microsoft: .3s
rustyrazorblade.com: .5s

Your total time will be .9s, just for API calls.

By using curl_multi_exec, you can execute those requests in parallel, and you’ll only be limited by the slowest request, which is about .5 sec to rustyrazorblade in this case, assuming your download bandwidth is not slowing you down.
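To make the arithmetic concrete, here is a minimal sketch using the example latencies above. The $latencies array and the variable names are made up for illustration, and real-world timings will also include connection setup and bandwidth effects.

```php
<?php
// Hypothetical latencies in seconds, taken from the example above.
$latencies = array(
    'Google'              => 0.1,
    'Microsoft'           => 0.3,
    'rustyrazorblade.com' => 0.5,
);

// One after the other: you pay for every request in turn.
$sequential = array_sum($latencies);

// In parallel: you wait roughly as long as the slowest request.
$parallel = max($latencies);

printf("sequential: %.1fs, parallel: %.1fs\n", $sequential, $parallel);
```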

Sample code:

// The URLs to fetch in parallel.
$nodes = array('http://www.google.com', 'http://www.microsoft.com', 'http://www.rustyrazorblade.com');
$node_count = count($nodes);

$curl_arr = array();
$master = curl_multi_init();

// Create one curl handle per URL and attach it to the multi handle.
for($i = 0; $i < $node_count; $i++)
{
	$url = $nodes[$i];
	$curl_arr[$i] = curl_init($url);
	// Return the response as a string instead of printing it.
	curl_setopt($curl_arr[$i], CURLOPT_RETURNTRANSFER, true);
	curl_multi_add_handle($master, $curl_arr[$i]);
}

// Keep driving the transfers until none are still running.
do {
    curl_multi_exec($master, $running);
} while($running > 0);

echo "results: ";
for($i = 0; $i < $node_count; $i++)
{
	// Fetch the body that was buffered for this handle.
	$results = curl_multi_getcontent($curl_arr[$i]);
	echo( $i . "\n" . $results . "\n");
}
echo 'done';

It’s really not documented on php.net how to use curl_multi_getcontent, so hopefully this helps someone.
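For anyone who wants a slightly more defensive variant, here is a rough sketch that wraps the same idea in a function. It is not the code from this post: fetch_all() and its $timeout parameter are hypothetical names, and the use of curl_multi_select() to wait between polls, curl_multi_info_read() to check each transfer's result code, and the handle cleanup at the end are additions you may or may not need.

```php
<?php
// fetch_all() is a hypothetical helper name, not part of the curl extension.
// It returns an array with the same keys as $urls; each value is the
// response body, or false if that transfer failed.
function fetch_all(array $urls, $timeout = 10)
{
    $master  = curl_multi_init();
    $handles = array();

    foreach ($urls as $key => $url) {
        $handles[$key] = curl_init($url);
        curl_setopt($handles[$key], CURLOPT_RETURNTRANSFER, true);
        curl_setopt($handles[$key], CURLOPT_TIMEOUT, $timeout); // seconds
        curl_multi_add_handle($master, $handles[$key]);
    }

    // Drive the transfers. curl_multi_select() waits for socket activity
    // instead of spinning; the short usleep() covers the case where it
    // returns -1 immediately.
    do {
        curl_multi_exec($master, $running);
        if ($running > 0 && curl_multi_select($master, 1.0) === -1) {
            usleep(1000);
        }
    } while ($running > 0);

    // Collect the per-transfer result codes (CURLE_OK means success).
    $codes = array();
    while ($info = curl_multi_info_read($master)) {
        $key = array_search($info['handle'], $handles, true);
        $codes[$key] = $info['result'];
    }

    $results = array();
    foreach ($handles as $key => $handle) {
        $ok = isset($codes[$key]) && $codes[$key] === CURLE_OK;
        $results[$key] = $ok ? curl_multi_getcontent($handle) : false;

        // Detach and close each handle once its content has been read.
        curl_multi_remove_handle($master, $handle);
        curl_close($handle); // has no effect on PHP 8 and later
    }
    curl_multi_close($master);

    return $results;
}
```

Calling fetch_all($nodes) with the $nodes array from the sample should give back one entry per URL, in the same order, which you can then loop over much like the echo loop above.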

  1. 55 Responses to “Executing multiple curl requests in parallel with PHP and curl_multi_exec”

  2. You are right, it is sad the lack of documentation about this on the php.net website. I’m glad you put a link to this guide as a comment.

    By techiegroups on Feb 24, 2008

  3. You forgot to close your handles I think….

    //close the handles

    inside the last for loop:
    curl_multi_remove_handle($curl_arr[$i]);

    after the last for loop:
    curl_multi_close($master);

    also see the PHP manual:
    http://cn.php.net/manual/en/function.curl-multi-init.php

    By Thijs (Shenzhen) on Apr 2, 2008

  4. small mistake, it should read:

    curl_multi_remove_handle($master, $curl_arr[$i]);

    By Thijs (Shenzhen) on Apr 2, 2008

  5. Hey Thijs,
    I’m pretty sure it’s right as is. The for loop is to add each of the curl handles to the multihandle. If I removed them, it wouldn’t work. Try my above code, it’ll work.

    By jon on Apr 3, 2008

  6. Hi Jon,

    I mean it’s better to close the handles after you use them (of course not before).

    Something like this:

    [code]
    for($i = 0; $i < $node_count; $i++)
    {
    $results = curl_multi_getcontent ( $curl_arr[$i] );
    echo( $i . “\n” . $results . “\n”);
    url_multi_remove_handle($curl_arr[$i]);
    }
    url_multi_remove_handle($master, $curl_arr[$i]);
    echo ‘done’;

    [/code]

    By Thijs (Shenzhen) on Apr 4, 2008

  7. Hi Jon,

    I mean it’s better to close the handles after you use them (of course not before).

    Something like this:

    [code]
    for($i = 0; $i < $node_count; $i++)
    {
    $results = curl_multi_getcontent ( $curl_arr[$i] );
    echo( $i . "\n" . $results . "\n");
    curl_multi_remove_handle($master, $curl_arr[$i]);
    }
    curl_multi_close($master);
    echo 'done';

    [/code]

    ps. please disregard the previous comment from me. Difficult to copy/paste code here ;)

    By Thijs (Shenzhen) on Apr 4, 2008

  8. hi, quick question: what would be the most efficient way to use cURL to grab a page AND the header, but only display the page to the users browser. i would ideally like to do this by only opening one cURL session. currently i have the following code, which, as you can see, is intended to pass all $_GET and $_POST info, but currently has no mechanism to keep track of cookies:

    $target_domain = 'http://targetdomain.com/fowlder/page.cfm';
    $ch = curl_init();
    curl_setopt ($ch, CURLOPT_URL, "{$target_domain}?{$_SERVER['QUERY_STRING']}");
    curl_setopt ($ch, CURLOPT_HEADER, true); // this displays header info on the user's browser but i really want to just load it into variables
    curl_setopt ($ch, CURLOPT_POST, true);
    curl_setopt ($ch, CURLOPT_POSTFIELDS, $_POST);
    curl_exec ($ch);
    curl_close ($ch);

    By sean on Apr 14, 2008

  9. Hey Sean,
    I poked around the PHP site, and there’s actually someone who wrote a function to do this.

    Give it a try, please let me know if it works.

    http://us3.php.net/manual/en/function.curl-setopt.php#42009

    By jon on Apr 15, 2008

  10. All you need is this part, btw:

    list($response_headers,$response_body) = explode("\r\n\r\n",$r,2);

    Enjoy.

    By jon on Apr 15, 2008

  11. Hello,
    I need some advice, please:
    I’m trying to connect to a website using curl, but it requires JavaScript and I’m not able to see what I need. Is there another solution to this?
    Regards

    By george on Apr 15, 2008

  12. You’ll have to figure out which javascript file you’ll need, then run your code through a javascript interpreter. To be honest, I’ve never done it, and I have no idea how good they are.

    Here’s 1 project:
    http://j4p5.sourceforge.net/

    By jon on Apr 15, 2008

  13. @george:

    In the past, when faced with that problem, I ended up writing a few regular expressions to get the data I needed directly out of the javascript files.

    Depending on what you’re trying to achieve, you may find this type of approach simplest.

    By frank on Apr 24, 2008

  14. My problem is:

    The first step is to log in with => http://192.168.1.100:8088/asterisk/manager?action=login&username=admin&secret=123

    and the second step is => http://192.168.1.100:8088/asterisk/rawman?action=sipeers

    but the second step does not succeed.

    The browser says: Response: Error Message: Authentication Required

    WHY???????????????

    I am logged in.

    By juan on May 15, 2008

  15. Fantastic! I came here from the PHP documentation and this is exactly what I was looking for. And, I agree with Thijs–you should close your handles after you are done using them.

    By Joe Lencioni on May 22, 2008

  16. Not nice with pages that have moved
    (301, 302 HTTP redirects)

    By neor on Jun 11, 2008

  17. Hi all
    I need some quick advice: I have a program crawling many pages from the same server with many file_get_contents() calls. Do you think I would get some improvement by replacing them with just one curl_multi_exec() call to retrieve them all at once, or will it probably take the same time since they all come from the same server?

    By arpo on Jun 15, 2008

  18. If you have the bandwidth, it’ll be faster to use curl_multi_exec.

    By jon on Jun 18, 2008

  19. Does anyone know how to catch redirected pages?

    I have a problem like this:

    URL1 -> URL2 -> URL3

    -> means redirect

    So I want to get the content of URL3 by entering only URL1. Does anyone know how to solve this problem?

    Thx..

    By CoLiq on Jun 28, 2008

  20. Coliq, try this:

    curl_setopt($curl, CURLOPT_FOLLOWLOCATION, true);

    By dan on Jul 8, 2008

  21. Just thinking out loud….
    Wouldn’t it be an idea to use some ajax to load each API’s data on its own, which would mean in this case the first response would arrive at .1s from Google? I mean: put the cURL part in a function in a different PHP file, then call that PHP file from a div with the proper variables using an onload function.

    just my thoughts and nothing concrete (yet)

    By Edwin on Aug 5, 2008

  22. Yes, you can execute multiple requests through ajax, but what if you’re not using PHP through the web, or the calls you’re making require a secret api key?

    By jon on Aug 5, 2008

  23. If it’s only for your own use (as I assume that’s what you mean by not through the web), then I don’t see much need to worry about performance or neat-looking ajax gifs telling you something is going on. I would use your way then. The ajax thought came to me as I was reading your post and thinking about ways to handle this.

    By Edwin on Aug 5, 2008

  24. Hi Jon,

    I’m trying to fill in the documentation gaps in cURL - I hope it’s okay to include something similar to this example?

    Thanks,
    Ross

    By Ross on Sep 1, 2008

  25. Sure. Feel free to link to this page as well.

    By jon on Sep 2, 2008

  26. Damn, that saved me a lot of time doing nasty API calls and writing them in a temporary file instead of polling the API directly.

    Thanks for the good work!

    By Mike Adolphs on Oct 11, 2008

  27. Thank you very much for this. Do you know how many URLs curl_multi_exec() can handle at once and not crash my application? Is it just dependent on my Memory?

    By Jeff on Oct 26, 2008

  28. Hey Jeff,
    I haven’t had a chance to try to max out curl_multi_exec. I suspect you’ll hit a bottleneck on your network before you’ll run out of memory.

    By jon on Oct 28, 2008

  29. Thanks for your reply Jon. Just like you said, I tried it with 20 URLs at once and it crashed Apache.

    By Jeff on Oct 29, 2008

  30. Jeff, there must be something whacky with your Apache installation. I’m doing something fairly similar to the above code, and when I clear the entire cache, it calls out to well over 20 URLs. I’ve yet to see Apache (2.2.6 - standard Slackware binary package) crash on it. (2.2.8 and later crash on me even without calling out to 20+ URLs, curiously. I’m pretty sure it’s just a packaging screwup, though, as I’ve tested a locally compiled 2.2.10 without hitting any crashes on cURL.)

    Cheers,
    - Dave

    By Dave on Nov 25, 2008

  31. I thought this was a great piece of code, and it was very useful. However, I think I can improve on it.
    In the original example, you showed how to reduce the wait from 0.9 to 0.5 seconds. But if one website is very slow, it could hold up your app for a long time. It would be great if we could work on each file as soon as it is returned, rather than waiting for ALL files in the “multi handle”. The function below echoes as soon as a site is downloaded, and after all pages are downloaded it returns an array in the same order as the input $nodes.

    function getMultipleDocuments($nodes, $referer){
        set_time_limit(90);
        if(!$referer){
            $referer = $nodes[0];
        }
        $node_count = count($nodes);

        $curl_arr = array();
        $master = curl_multi_init();

        for($i = 0; $i < $node_count; $i++)
        {
            $curl_arr[$i] = curl_init($nodes[$i]);
            curl_setopt($curl_arr[$i], CURLOPT_FOLLOWLOCATION, true);
            curl_setopt($curl_arr[$i], CURLOPT_FRESH_CONNECT, true);
            curl_setopt($curl_arr[$i], CURLOPT_CONNECTTIMEOUT, 10);
            curl_setopt($curl_arr[$i], CURLOPT_RETURNTRANSFER, true);
            curl_setopt($curl_arr[$i], CURLOPT_REFERER, $referer);
            curl_setopt($curl_arr[$i], CURLOPT_TIMEOUT, 30);

            curl_multi_add_handle($master, $curl_arr[$i]);
        }
        $finalresult = array();
        $returnedOrder = array();
        do{
            curl_multi_exec($master, $running);
            // Drain every finished transfer; curl_multi_info_read() returns false when none are left.
            while($info = curl_multi_info_read($master)){
                $index = array_search($info['handle'], $curl_arr, true);
                $finalresult[] = curl_multi_getcontent($info['handle']);
                $returnedOrder[] = $index;
                curl_multi_remove_handle($master, $info['handle']);
                curl_close($curl_arr[$index]);
                echo 'downloaded '.$nodes[$index].'. We can process it further straight away, but for this example, we will not.';
                if(ob_get_level() > 0){ ob_flush(); }
                flush();
            }
        }while($running > 0);
        curl_multi_close($master);

        set_time_limit(30);
        $docs = array_combine($returnedOrder, $finalresult);
        ksort($docs); // put the results back in the order of the input $nodes
        return $docs;
    }

    $nodes = array('http://mediumSpeedSite.org', 'http://fastSpeedSite.com', 'http://quiteSlowSite.com');
    $returnedDocs = getMultipleDocuments($nodes, null);

    By Simon on Dec 15, 2008

  32. thanks for posting this, there is little to no documentation on the php.net website, this cleared things up

    By simmeh on Jan 22, 2009

  33. Thanks for sharing. I made some modifications so that each request is processed as soon as it completes. I’ve found that makes things a lot faster, particularly when you’re dealing with a large number of requests:

    http://onlineaspect.com/2009/01/26/how-to-use-curl_multi-without-blocking/

    By Josh Fraser on Jan 26, 2009

  34. How does one speed up the time it takes for a CURL script to execute? Currently I am trying to execute 1 URL. I am told it is only taking .2 seconds on the other end for the server to respond, but it is taking roughly 20 seconds for the CURL script to fully execute and return a response to me. Any ideas?

    By macrunnign on Feb 25, 2009

  35. What else are you doing on the page?

    The simplest way to find out what’s taking a long time is to time the various parts of the page using microtime(true). If you want to be more advanced, look into xdebug and kcachegrind (or webgrind)

    Check it out here: http://www.rustyrazorblade.com/2007/07/26/php-setting-up-xdebug-with-kcachegrind/

    By jon on Feb 25, 2009

  36. At this point I am just running the curl execution without anything else going on. It is a test page at the moment. I am trying to work out a solution for another problem. Would you happen to know how I capture the response from the other server? I am not sure how to load it into a variable. Currently I am getting the response but haven’t figured out what I am supposed to use to capture the other server’s response.
    thanks.

    By macrunnign on Feb 25, 2009

  37. To have the results returned by curl_exec, you need to use this:

    curl_setopt($curl_handle, CURLOPT_RETURNTRANSFER, true);

    By jon on Feb 25, 2009

  38. I’ve got the response already I just am not sure how to load it into a variable that I can then use to manipulate the page. Are you saying $curl_handle will be my variable?
    So that if I want to create an if() statement I would use if($curle_handle = “response text”) { do this}

    ???

    By macrunnign on Feb 25, 2009

  39. You’d do this to get the response into a variable:

    $curl_handle = curl_init("http://whatever.com");
    curl_setopt($curl_handle, CURLOPT_RETURNTRANSFER, true);
    $response = curl_exec($curl_handle);

    If you’re worried about timeout, you can also do:

    curl_setopt($curl_handle, CURLOPT_TIMEOUT, 2);

    Last param is seconds.

    You should get familiar with the curl options here: http://us2.php.net/manual/en/function.curl-setopt.php

    Good luck,
    Jon

    By jon on Feb 25, 2009

  40. I have tried this but I’m not sure I am getting what I need. Here is what i have so far. Maybe you can see if I messed up somewhere:
    [code]
    $myVin = $_GET['vin'];
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_HEADER, 0);
    curl_setopt($ch, CURLOPT_URL, "http://socket.somewebsite.com:8080/?UID=C412012&REQUEST=INV&VIN=$myVin&INV_DATE=N&ONE_OWNER=Y");
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    $result1 = curl_exec($ch);
    if($result1['curlopt_returntransfer'] == "Yes 1") {
    echo "We have a Winner!";
    } else {
    echo "Not a One Owner";
    }
    }
    curl_close($ch);
    [/code]

    By macrunnign on Feb 25, 2009

  41. P.S. I’m not worried about timeout. Thanks. I am more concerned with the amount of time it is taking to send and receive the response though.

    By macrunnign on Feb 25, 2009

  42. Ok. So I am confused now. I’ve got the code I mentioned above working. The problem seems to be that the variable $result1 does not want to work properly in my if statement.
    I believe it should be written so but please let me know if I am wrong.

    if($result1 == "Yes 1") {
    echo "We have a Winner!";
    }
    if(result1 == "Yes N") {
    echo "Not a One Owner";
    }

    For some reason my if statement does not appear to be working. Am I missing something on this?

    By macrunnign on Feb 25, 2009

  43. Oh yeah, I’ve been poring over the manual on php.net as you mentioned earlier but still cannot figure out for the life of me why my variable is not being accepted. From what I gather my variable $result1 should read either “Yes N” or “Yes 1” but when I try to pass it to my if statement, the if statement does not seem to recognize the variable at all. I can echo $result1 onto the page, but something seems amiss for the if statement not to recognize it.

    By macrunnign on Feb 26, 2009

  44. I tried to hit that URL, it never gave a result. The problem is the remote site. Try it in your browser first.

    By jon on Feb 26, 2009

  45. Hi Jon, I just put a dummy url in this forum for security purposes. I think I’ve got it figured out. I just wish I could find something that would execute quicker. Total time it takes for this to send, receive and execute based on the response is 20 seconds. Way too long at the moment. I need it to be 1 or 2 seconds at most.

    By macrunnign on Feb 26, 2009

  46. You’re being pretty vague… I’ve suggested ways to find out what’s taking up so much time.

    I can’t really help you for free anymore - you might want to look to a user group or consultant to help you with this. Good luck.

    By jon on Feb 26, 2009

  47. oh. sorry to take up your time. I thought this was a free forum. my apologies.

    By macrunnign on Feb 26, 2009

  48. It is, but you aren’t taking the advice I’ve given to isolate the code that’s taking 20 seconds.

    By jon on Feb 26, 2009

  49. I use curl in my scripts. Your sample is great. But I got a problem: for example, on the HTML-page I have 10 links, in every link curl downloading
    http://www.example.com/?page=1 in first link
    http://www.example.com/?page=2 in second link

    http://www.example.com/?page=10 in tenth link

    If you quickly open each link (on the HTML page) in new tabs, you’ll see a funny thing: until one tab (one copy of the running script) has completed, the next copy of the script will not run.
    This problem with “curl” and “multi-curl”(((

    Please help. How fix that?

    By Stern87 on Mar 8, 2009

  50. Are you sure it’s not just your browser being throttled? Try loading the URL from a second machine and see if you still get that delay.

    By jon on Mar 10, 2009

  51. I’m sure. From the second machine - it’s ok, but if you open next tab, next…. you’ll get the same funny problem.
    This is very sad.

    Here is simple example for an experiment.
    test.php:
    ===============================================
    <?
    if (isset($_GET["link"])) {
    $link = "http://nehe.gamedev.net/lesson.asp?index=0".$_GET["link"];
    $nodes = array();
    array_push($nodes, $link);
    $node_count = count($nodes);

    $curl_arr = array();
    $master = curl_multi_init();

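    for($i = 0; $i < $node_count; $i++)
    {
    $url = $nodes[$i];
    $curl_arr[$i] = curl_init($url);
    curl_setopt($curl_arr[$i], CURLOPT_RETURNTRANSFER, true);
    curl_multi_add_handle($master, $curl_arr[$i]);
    }

    do {
    curl_multi_exec($master,$running);
    } while($running > 0);

    echo "results: ";
    for($i = 0; $i < $node_count; $i++)
    {
    $results = curl_multi_getcontent($curl_arr[$i]);
    echo( $i . "\n" . $results . "\n");
    }
    echo 'done';
    }
    ?>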
    Link 1
    Link 2
    Link 3
    Link 4
    Link 5
    Link 6
    Link 7
    Link 8
    Link 9
    ===============================================

    I’m using the Firefox browser, so if you open all the links in new tabs (middle click) rapidly, you will see what I mean.
    Until the curl code block of the first running script has finished, the next script will not start downloading. Why??? How can I fix it?

    Thanks again!
    I appreciate you, Jon.

    By Stern87 on Mar 13, 2009

  52. Stern87,
    I’m pretty sure your issue is related to the browser, not curl. Try running all 10 commands at the same time from 10 command lines (just use wget or curl itself) and see if you’re still throttled. If not, it could be a server issue, but might just be a network thing.

    However, it’s possible it’s an apache issue - try setting MaxSpareServers to something like 100, restart, and make sure if you do “ps aux | grep httpd” you see a ton of processes.

    You’ll also want to determine if it’s actually curl that’s causing the problem, so you could throw a sleep(10) into your script (the argument is in seconds) and see what happens when you load the 10 links.

    Finally - and possibly more importantly, are you loading 10 asp pages on a different server? If so, everything we’ve discussed thus far has absolutely nothing to do with curl and everything to do with whatever server you’re trying to load the page of.

    By jon on Mar 20, 2009

  53. This will busy-loop, which is not good. Check out the CurlObjects implementation.


    do {
    curl_multi_exec($master,$running);
    } while($running > 0);
    —-

    By curlobjects on Mar 21, 2009

  54. I don’t see how the curl objects implementation is any better, it still has to wait on all requests to finish, if I’m reading it correctly. Just because there’s more code in the while loop doesn’t make it any better.

    By all means, correct me if I’m wrong, as I just glanced at the class.CurlBase.php file.

    By jon on Mar 22, 2009

  55. wow.. this helped me tons..

    i was only getting 20 requests a minute because of latency.. but with this it’s not as much of an issue.

    I’m getting around 100 requests, give or take.

    By Cody on Mar 25, 2009

  56. hi,

    i wrote a little function which helps me get webpage content and deal with it in my program:
    [code]
    function request ($url) {
    $ch = curl_init ($url) ;
    curl_setopt ($ch, CURLOPT_RETURNTRANSFER, true) ;
    $res =curl_exec ($ch) ;
    curl_close ($ch) ;
    return ($res) ;}
    [/code]
    but when i call this function a lot, e.g. in some loop, it slows things down, so i’m wondering if there is a trick to speed it up

    if it’s important to talk about my program’s job, let me send it to your email

    thank you very much

    By Tariq on Jun 13, 2009
