Executing multiple curl requests in parallel with PHP and curl_multi_exec
February 20, 2008 – 4:17 pm
Let’s get one thing out in the open. Curl is sweet. It does its job very well, and I’m absolutely thrilled it exists.
If you’re using curl in your PHP app to make web requests, you’ve probably realized that by doing them one after the other, the total time of your request is the sum of all the requests put together. That’s lame.
Unfortunately, curl_multi_exec is poorly documented in the PHP manual.
Let’s say that your app is hitting APIs from these servers:
Google: .1s
Microsoft: .3s
rustyrazorblade.com: .5s
Your total time will be .9s, just for API calls.
By using curl_multi_exec, you can execute those requests in parallel, and you’ll only be limited by the slowest request, which is about .5 sec to rustyrazorblade in this case, assuming your download bandwidth is not slowing you down.
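The arithmetic behind those numbers can be checked with a few lines of PHP. This is only an illustration using the example latencies above; the variable names are made up for the sketch, and real timings will vary with the network.

```php
<?php
// Latencies in seconds, taken from the example figures above.
$latencies = array(
    'Google'              => 0.1,
    'Microsoft'           => 0.3,
    'rustyrazorblade.com' => 0.5,
);

// Sequential requests: each one waits for the previous one to finish.
$sequential = array_sum($latencies);

// Parallel requests: the slowest one sets the total (ignoring bandwidth limits).
$parallel = max($latencies);

printf("sequential: %.1fs, parallel: %.1fs\n", $sequential, $parallel);
```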
Sample code:
$nodes = array('http://www.google.com', 'http://www.microsoft.com', 'http://www.rustyrazorblade.com');
$node_count = count($nodes);
$curl_arr = array();
$master = curl_multi_init();

// Create one curl handle per URL and attach it to the multi handle.
for ($i = 0; $i < $node_count; $i++)
{
    $url = $nodes[$i];
    $curl_arr[$i] = curl_init($url);
    // Return the response as a string instead of printing it.
    curl_setopt($curl_arr[$i], CURLOPT_RETURNTRANSFER, true);
    curl_multi_add_handle($master, $curl_arr[$i]);
}

// Keep calling curl_multi_exec until no transfers are still running.
do {
    curl_multi_exec($master, $running);
} while ($running > 0);

echo "results: ";
// Read each response back from its own handle.
for ($i = 0; $i < $node_count; $i++)
{
    $results = curl_multi_getcontent($curl_arr[$i]);
    echo($i . "\n" . $results . "\n");
}
echo 'done';
It’s really not documented on php.net how to use curl_multi_getcontent, so hopefully this helps someone.
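For reuse, the sample above can be wrapped in a function. What follows is a minimal sketch rather than a definitive implementation: fetch_all() and its $timeout_seconds parameter are hypothetical names, and it assumes the curl extension is loaded. It differs from the sample in two ways: it waits in curl_multi_select() between calls instead of spinning in a tight loop, and it detaches each handle and closes the multi handle once the content has been read.

```php
<?php
/**
 * Fetch several URLs in parallel.
 * fetch_all() is a hypothetical helper name, not part of PHP or libcurl.
 * Returns an array keyed like $urls; each entry holds 'http_code' and 'body'.
 */
function fetch_all(array $urls, $timeout_seconds = 10)
{
    $master  = curl_multi_init();
    $handles = array();

    foreach ($urls as $key => $url) {
        $handles[$key] = curl_init($url);
        curl_setopt($handles[$key], CURLOPT_RETURNTRANSFER, true);
        curl_setopt($handles[$key], CURLOPT_TIMEOUT, $timeout_seconds);
        curl_multi_add_handle($master, $handles[$key]);
    }

    // Drive the transfers; wait for socket activity instead of spinning.
    do {
        curl_multi_exec($master, $running);
        if ($running > 0 && curl_multi_select($master, 1.0) === -1) {
            usleep(10000); // select reported a failure; back off briefly
        }
    } while ($running > 0);

    $results = array();
    foreach ($handles as $key => $handle) {
        $results[$key] = array(
            // 0 means no HTTP response was received at all.
            'http_code' => curl_getinfo($handle, CURLINFO_HTTP_CODE),
            'body'      => curl_multi_getcontent($handle),
        );
        curl_multi_remove_handle($master, $handle);
    }
    curl_multi_close($master);

    return $results;
}

// Usage with two of the URLs from the sample above:
// $pages = fetch_all(array('http://www.google.com', 'http://www.microsoft.com'));
// echo $pages[0]['http_code'];
```

Keying the results like the input means the caller can pass an associative array and look responses up by name, which is a design choice of this sketch and not something the original sample does.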



18 Responses to “Executing multiple curl requests in parallel with PHP and curl_multi_exec”
You are right, the lack of documentation about this on the php.net website is sad. I’m glad you put a link to this guide as a comment.
By techiegroups on Feb 24, 2008
You forgot to close your handles I think….
//close the handles
inside the last for loop:
curl_multi_remove_handle($curl_arr[$i]);
after the last for loop:
curl_multi_close($master);
also see the PHP manual:
http://cn.php.net/manual/en/function.curl-multi-init.php
By Thijs (Shenzhen) on Apr 2, 2008
small mistake, it should read:
curl_multi_remove_handle($master, $curl_arr[$i]);
By Thijs (Shenzhen) on Apr 2, 2008
Hey Thijs,
I’m pretty sure it’s right as is. The for loop is to add each of the curl handles to the multihandle. If I removed them, it wouldn’t work. Try my above code, it’ll work.
By jon on Apr 3, 2008
Hi Jon,
I mean it’s better to close the handles after you use them (of course not before).
Something like this:
[code]
for($i = 0; $i < $node_count; $i++)
{
$results = curl_multi_getcontent ( $curl_arr[$i] );
echo( $i . “\n” . $results . “\n”);
url_multi_remove_handle($curl_arr[$i]);
}
url_multi_remove_handle($master, $curl_arr[$i]);
echo ‘done’;
[/code]
By Thijs (Shenzhen) on Apr 4, 2008
Hi Jon,
I mean it’s better to close the handles after you use them (of course not before).
Something like this:
[code]
for ($i = 0; $i < $node_count; $i++)
{
    $results = curl_multi_getcontent($curl_arr[$i]);
    echo($i . "\n" . $results . "\n");
    // Detach the handle once its content has been read.
    curl_multi_remove_handle($master, $curl_arr[$i]);
}
curl_multi_close($master);
echo 'done';
[/code]
ps. please disregard the previous comment from me. Difficult to copy/paste code here
By Thijs (Shenzhen) on Apr 4, 2008
Hi, quick question: what would be the most efficient way to use cURL to grab a page AND the header, but only display the page in the user’s browser? I would ideally like to do this by opening only one cURL session. Currently I have the following code, which, as you can see, is intended to pass along all $_GET and $_POST info, but has no mechanism to keep track of cookies:
$target_domain = 'http://targetdomain.com/fowlder/page.cfm';
$ch = curl_init();
curl_setopt ($ch, CURLOPT_URL, "{$target_domain}?{$_SERVER['QUERY_STRING']}");
curl_setopt ($ch, CURLOPT_HEADER, true); // this displays header info in the user's browser, but I really want to just load it into variables
curl_setopt ($ch, CURLOPT_POST, true);
curl_setopt ($ch, CURLOPT_POSTFIELDS, $_POST);
curl_exec ($ch);
curl_close ($ch);
By sean on Apr 14, 2008
Hey Sean,
I poked around the PHP site, and there’s actually someone who wrote a function to do this.
Give it a try, and please let me know if it works.
http://us3.php.net/manual/en/function.curl-setopt.php#42009
By jon on Apr 15, 2008
All you need is this part, btw:
list($response_headers, $response_body) = explode("\r\n\r\n", $r, 2);
Enjoy.
By jon on Apr 15, 2008
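The split jon describes can be sketched as a small function. This assumes the response was fetched with both CURLOPT_HEADER and CURLOPT_RETURNTRANSFER enabled, so that curl_exec() returns headers and body as one string. split_response() is a hypothetical helper name, and the canned response is made up so the example runs without a network connection.

```php
<?php
/**
 * Split a raw HTTP response into header and body strings.
 * split_response() is a hypothetical helper name used for illustration.
 */
function split_response($raw)
{
    // Headers end at the first blank line; the limit of 2 keeps the body intact.
    $parts   = explode("\r\n\r\n", $raw, 2);
    $headers = $parts[0];
    $body    = isset($parts[1]) ? $parts[1] : '';
    return array($headers, $body);
}

// Canned response standing in for the return value of curl_exec().
$raw = "HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n<p>hello</p>";
list($response_headers, $response_body) = split_response($raw);

// Only the body is sent on to the browser.
echo $response_body, "\n";
```

One caveat: if the response contains more than one header block (for example after a followed redirect), the first blank line is not the end of the final headers. In that case, cutting the string at the offset reported by curl_getinfo() with CURLINFO_HEADER_SIZE is likely a more robust option.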
Hello,
I need some advice, please:
I am trying to connect to a website using curl, but it requires JavaScript and I am not able to see what I need. Is there another solution to this?
Regards
By george on Apr 15, 2008
You’ll have to figure out which JavaScript file you need, then run your code through a JavaScript interpreter. To be honest, I’ve never done it, and I have no idea how good they are.
Here’s 1 project:
http://j4p5.sourceforge.net/
By jon on Apr 15, 2008
@george:
In the past, when faced with that problem, I ended up writing a few regular expressions to get the data I needed directly out of the javascript files.
Depending on what you’re trying to achieve, you may find this type of approach simplest.
By frank on Apr 24, 2008
My problem is:
the first request is a login with => http://192.168.1.100:8088/asterisk/manager?action=login&username=admin&secret=123
and the second request is => http://192.168.1.100:8088/asterisk/rawman?action=sipeers
but the second request does not succeed.
The browser says: Response: Error Message: Authentication Required
Why, if I’m logged in?
By juan on May 15, 2008
Fantastic! I came here from the PHP documentation and this is exactly what I was looking for. And, I agree with Thijs–you should close your handles after you are done using them.
By Joe Lencioni on May 22, 2008
This does not work well with pages that have moved
(HTTP 301 and 302 redirects).
By neor on Jun 11, 2008
Hi all
I need some quick advice: I have a program crawling many pages from the same server with many file_get_contents() calls. Do you think I would get some improvement by replacing them with a single curl_multi_exec() call to retrieve them all at once, or will it probably take the same time since they all come from the same server?
By arpo on Jun 15, 2008
If you have the bandwidth, it’ll be faster to use curl_multi_exec.
By jon on Jun 18, 2008
Does anyone know how to follow redirects?
I have a problem like this:
URL1 -> URL2 -> URL3
(-> means redirect)
I want to get the content of URL3 by requesting only URL1. Does anyone know how to solve this?
Thx.
By CoLiq on Jun 28, 2008
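One likely answer to the redirect questions above is curl’s CURLOPT_FOLLOWLOCATION option, which tells curl to follow Location headers itself. The sketch below is an assumption about what would fit this case, not something tested against the commenters’ sites; make_following_handle() and its default limit of 5 redirects are hypothetical.

```php
<?php
/**
 * Build a curl handle that follows redirects.
 * make_following_handle() and the $max_redirects default are hypothetical.
 */
function make_following_handle($url, $max_redirects = 5)
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    // Follow Location headers, so a request for URL1 ends up at URL3.
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    // Cap the chain length to avoid redirect loops.
    curl_setopt($ch, CURLOPT_MAXREDIRS, $max_redirects);
    return $ch;
}

// Usage (the URL is a placeholder):
// $ch = make_following_handle('http://example.com/url1');
// $content = curl_exec($ch);                               // content of the final page
// $final_url = curl_getinfo($ch, CURLINFO_EFFECTIVE_URL);  // where the chain ended
```

A handle built this way can also be passed to curl_multi_add_handle(), so the same options should apply to the parallel requests in the post.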