马上注册,结交更多好友,享用更多功能,让你轻松玩转社区。
您需要 登录 才可以下载或查看,没有帐号?立即注册
x
exit来实现结束后面的PHP语句的执行,缩小调试范围,特别是数据库交互的程序,先输出个SQL语句看看,对了,再分析怎么会插入/删除不成功呢?这样对查错很有帮助。 在实践项目或本人编写小东西(好比旧事聚合,商品价钱监控,比价)的过程当中, 凡是需求从第3方网站或API接口获得数据, 在需求处置1个URL队列时, 为了进步功能, 可以采取cURL供应的curl_multi_*族函数完成复杂的并发.
本文将切磋两种详细的完成办法, 并对分歧的办法做复杂的功能对照.
1. 经典cURL并发机制及其存在的成绩
经典的cURL完成机制在网上很轻易找到, 好比参考PHP在线手册的以下完成体例:
function classic_curl($urls, $delay) {
$queue = curl_multi_init();
$map = array();
foreach ($urls as $url) {
// create cURL resources
$ch = curl_init();
// set URL and other appropriate options
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_TIMEOUT, 1);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_NOSIGNAL, true);
// add handle
curl_multi_add_handle($queue, $ch);
$map[$url] = $ch;
}
$active = null;
// execute the handles
do {
$mrc = curl_multi_exec($queue, $active);
} while ($mrc == CURLM_CALL_MULTI_PERFORM);
while ($active > 0 && $mrc == CURLM_OK) {
if (curl_multi_select($queue, 0.5) != -1) {
do {
$mrc = curl_multi_exec($queue, $active);
} while ($mrc == CURLM_CALL_MULTI_PERFORM);
}
}
$responses = array();
foreach ($map as $url=>$ch) {
$responses[$url] = callback(curl_multi_getcontent($ch), $delay);
curl_multi_remove_handle($queue, $ch);
curl_close($ch);
}
curl_multi_close($queue);
return $responses;
}
起首将一切的URL压入并发队列, 然后履行并发进程, 守候一切恳求吸收完以后停止数据的解析等后续处置. 在实践的处置过程当中, 受收集传输的影响, 局部URL的内容会优先于其他URL前往, 然而经典cURL并发必需守候最慢的谁人URL前往以后才入手下手处置, 守候也就意味着CPU的余暇和华侈. 假如URL队列很短, 这类余暇和华侈还处在可承受的局限, 但假如队列很长, 这类守候和华侈将变得不成承受.
2. 改善的Rolling cURL并发体例
细心剖析不难发明经典cURL并发回存在优化的空间, 优化的体例时当某个URL恳求终了以后尽量快的去向理它, 边处置边守候其他的URL前往, 而不是守候谁人最慢的接口前往以后才入手下手处置等任务, 从而防止CPU的余暇和华侈. 闲话不多说, 上面贴上详细的完成:
function rolling_curl($urls, $delay) {
$queue = curl_multi_init();
$map = array();
foreach ($urls as $url) {
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_TIMEOUT, 1);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_NOSIGNAL, true);
curl_multi_add_handle($queue, $ch);
$map[(string) $ch] = $url;
}
$responses = array();
do {
while (($code = curl_multi_exec($queue, $active)) == CURLM_CALL_MULTI_PERFORM) ;
if ($code != CURLM_OK) { break; }
// a request was just completed -- find out which one
while ($done = curl_multi_info_read($queue)) {
// get the info and content returned on the request
$info = curl_getinfo($done['handle']);
$error = curl_error($done['handle']);
$results = callback(curl_multi_getcontent($done['handle']), $delay);
$responses[$map[(string) $done['handle']]] = compact('info', 'error', 'results');
// remove the curl handle that just completed
curl_multi_remove_handle($queue, $done['handle']);
curl_close($done['handle']);
}
// Block for data in / output; error handling is done by curl_multi_exec
if ($active > 0) {
curl_multi_select($queue, 0.5);
}
} while ($active);
curl_multi_close($queue);
return $responses;
}
3. 两种并发完成的功能对照
改善前后的功能对照实验在LINUX主机长进行, 测试时利用的并发队列以下:
http://item.taobao.com/item.htm?id=14392877692
http://item.taobao.com/item.htm?id=16231676302
http://item.taobao.com/item.htm?id=17037160462
http://item.taobao.com/item.htm?id=5522416710
http://item.taobao.com/item.htm?id=16551116403
http://item.taobao.com/item.htm?id=14088310973
扼要申明下实行设计的准绳和功能测试了局的格局: 为包管了局的牢靠, 每组实行反复20次, 在单次实行中, 给定不异的接口URL纠合, 分离丈量Classic(指经典的并发机制)和Rolling(指改善后的并发机制)两种并发机制的耗时(秒为单元), 耗时短者胜出(Winner), 并盘算节俭的工夫(Excellence, 秒为单元)和功能提拔比例(Excel. %). 为了尽可能切近真实的恳求而又坚持实行的复杂, 在对前往了局的处置上只是做了复杂的正则表达式婚配, 而没有停止其他庞杂的操作. 别的, 为了肯定了局处置回调对功能对照测试了局的影响, 可使用usleep摹拟实际中对照担任的数据处置逻辑(如提取, 分词, 写入文件或数据库等).
功能测试顶用到的回调函数为:
function callback($data, $delay) {
preg_match_all('/<h3>(.+)<\/h3>/iU', $data, $matches);
usleep($delay);
return compact('data', 'matches');
}
数据处置回调无延迟时: Rolling Curl略优, 但功能提拔后果不分明.
- ------------------------------------------------------------------------------------------------ Delay: 0 micro seconds, equals to 0 milli seconds ------------------------------------------------------------------------------------------------ Counter Classic Rolling Winner Excellence Excel. % ------------------------------------------------------------------------------------------------ 1 0.1193 0.0390 Rolling 0.0803 67.31% 2 0.0556 0.0477 Rolling 0.0079 14.21% 3 0.0461 0.0588 Classic -0.0127 -21.6% 4 0.0464 0.0385 Rolling 0.0079 17.03% 5 0.0534 0.0448 Rolling 0.0086 16.1% 6 0.0540 0.0714 Classic -0.0174 -24.37% 7 0.0386 0.0416 Classic -0.0030 -7.21% 8 0.0357 0.0398 Classic -0.0041 -10.3% 9 0.0437 0.0442 Classic -0.0005 -1.13% 10 0.0319 0.0348 Classic -0.0029 -8.33% 11 0.0529 0.0430 Rolling 0.0099 18.71% 12 0.0503 0.0581 Classic -0.0078 -13.43% 13 0.0344 0.0225 Rolling 0.0119 34.59% 14 0.0397 0.0643 Classic -0.0246 -38.26% 15 0.0368 0.0489 Classic -0.0121 -24.74% 16 0.0502 0.0394 Rolling 0.0108 21.51% 17 0.0592 0.0383 Rolling 0.0209 35.3% 18 0.0302 0.0285 Rolling 0.0017 5.63% 19 0.0248 0.0553 Classic -0.0305 -55.15% 20 0.0137 0.0131 Rolling 0.0006 4.38% ------------------------------------------------------------------------------------------------ Average 0.0458 0.0436 Rolling 0.0022 4.8% ------------------------------------------------------------------------------------------------ Summary: Classic wins 10 times, while Rolling wins 10 times
复制代码
数据处置回调延迟5毫秒: Rolling Curl完胜, 功能提拔40%摆布.
- ------------------------------------------------------------------------------------------------ Delay: 5000 micro seconds, equals to 5 milli seconds ------------------------------------------------------------------------------------------------ Counter Classic Rolling Winner Excellence Excel. % ------------------------------------------------------------------------------------------------ 1 0.0658 0.0352 Rolling 0.0306 46.5% 2 0.0728 0.0367 Rolling 0.0361 49.59% 3 0.0732 0.0387 Rolling 0.0345 47.13% 4 0.0783 0.0347 Rolling 0.0436 55.68% 5 0.0658 0.0286 Rolling 0.0372 56.53% 6 0.0687 0.0362 Rolling 0.0325 47.31% 7 0.0787 0.0337 Rolling 0.0450 57.18% 8 0.0676 0.0391 Rolling 0.0285 42.16% 9 0.0668 0.0351 Rolling 0.0317 47.46% 10 0.0603 0.0317 Rolling 0.0286 47.43% 11 0.0714 0.0350 Rolling 0.0364 50.98% 12 0.0627 0.0215 Rolling 0.0412 65.71% 13 0.0617 0.0401 Rolling 0.0216 35.01% 14 0.0721 0.0226 Rolling 0.0495 68.65% 15 0.0701 0.0428 Rolling 0.0273 38.94% 16 0.0674 0.0352 Rolling 0.0322 47.77% 17 0.0452 0.0425 Rolling 0.0027 5.97% 18 0.0596 0.0366 Rolling 0.0230 38.59% 19 0.0679 0.0480 Rolling 0.0199 29.31% 20 0.0657 0.0338 Rolling 0.0319 48.55% ------------------------------------------------------------------------------------------------ Average 0.0671 0.0354 Rolling 0.0317 47.24% ------------------------------------------------------------------------------------------------ Summary: Classic wins 0 times, while Rolling wins 20 times
复制代码 经由过程下面的功能对照, 在处置URL队列并发的使用场景中Rolling cURL应当是加倍的选择, 并发量十分大(1000+)时, 可以掌握并发队列的最大长度, 好比20, 每当1个URL前往并处置终了以后当即到场1个还没有恳求的URL到队列中, 如许写出来的代码会加倍强健, 不至于并发数太大而卡逝世或溃散. 具体的完成请参考: http://code.谷歌.com/p/rolling-curl/
不懂的问题有很多高手帮你解决。但不要认为你是新手,就不能帮助别人,比如今天你学会了怎样安装PHP,明天还可能有朋友会问这个问题,你就可以给他解答,不要认为这是浪费时间,忙别人其实就是帮助自己。 |