Five Common PHP cURL Examples
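1. Fetching a page

<?php
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'http://localhost/mytest/phpinfo.php');
curl_setopt($ch, CURLOPT_HEADER, false);
// Return the response as a string; if this line is commented out, curl_exec() prints it directly
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$result = curl_exec($ch);
curl_close($ch);
?>

2. Fetching through a proxy

Why fetch through a proxy? Take Google as an example: if you scrape Google's data too often within a short period, the requests stop succeeding because Google restricts your IP address. At that point you can switch to a proxy and fetch again.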
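The idea of switching to another proxy when one stops working can be sketched as a small fallback loop. This is only an illustration: the function names `fetchWithProxyFallback` and `fetchViaProxy` are made up for this sketch, the proxy list is whatever "host:port" strings you supply, and `CURLOPT_FAILONERROR` is used on the assumption that a blocked request comes back as an HTTP error status.

```php
<?php
/**
 * Try each proxy in turn until one returns a response.
 * $proxies holds "host:port" strings supplied by the caller (hypothetical values).
 * $fetch can be replaced so the loop can be exercised without network access.
 */
function fetchWithProxyFallback(string $url, array $proxies, ?callable $fetch = null)
{
    $fetch = $fetch ?? 'fetchViaProxy';
    foreach ($proxies as $proxy) {
        $body = $fetch($url, $proxy);
        if ($body !== false) {
            return $body;   // the first proxy that works wins
        }
    }
    return false;           // every proxy failed
}

/** Fetch $url through a single proxy; returns the body, or false on failure. */
function fetchViaProxy(string $url, string $proxy)
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_PROXY, $proxy);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 10);
    // Treat HTTP status codes >= 400 as failures so the loop moves on
    curl_setopt($ch, CURLOPT_FAILONERROR, true);
    $body = curl_exec($ch);
    curl_close($ch);
    return $body;
}
```

A call would look like `fetchWithProxyFallback('http://blog.51yip.com', array('125.21.23.6:8080', '10.0.0.2:3128'))`, where the second address is a placeholder.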
<?php
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'http://blog.51yip.com');
curl_setopt($ch, CURLOPT_HEADER, false);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_HTTPPROXYTUNNEL, true);
// The proxy address must be a quoted string
curl_setopt($ch, CURLOPT_PROXY, '125.21.23.6:8080');
// If the proxy requires a password, add this line:
// curl_setopt($ch, CURLOPT_PROXYUSERPWD, 'user:password');
$result = curl_exec($ch);
curl_close($ch);
?>

3. Fetching data after a POST

Submitting data deserves a section of its own: requests made with curl often exchange data with the server, so this case is an important one.
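One detail that helps when posting data: `CURLOPT_POSTFIELDS` accepts either an array or a URL-encoded string. An array is sent as `multipart/form-data`, while a string is sent as `application/x-www-form-urlencoded`. The sketch below shows one way to turn nested form data into such a string with `http_build_query()`. The helper names `buildPostBody` and `postForm` are made up for this illustration.

```php
<?php
/** Encode form data (nested arrays allowed) as a URL-encoded request body. */
function buildPostBody(array $data): string
{
    // The separator is passed explicitly so the result does not depend on php.ini
    return http_build_query($data, '', '&');
}

/** POST the data to $url as application/x-www-form-urlencoded and return the response. */
function postForm(string $url, array $data)
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_POST, true);
    // A string body (rather than an array) selects the urlencoded content type
    curl_setopt($ch, CURLOPT_POSTFIELDS, buildPostBody($data));
    $response = curl_exec($ch);
    curl_close($ch);
    return $response;
}

// Nested arrays are flattened into name[0]=...&name[1]=... pairs
echo buildPostBody(array(
    'name'  => array('tank', 'zhang'),
    'sex'   => 1,
    'birth' => '20101010',
)), "\n";
// prints: name%5B0%5D=tank&name%5B1%5D=zhang&sex=1&birth=20101010
```

On the receiving side PHP decodes `name%5B0%5D` back into `$_POST['name'][0]`, so the nested structure survives the round trip.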
<?php
$ch = curl_init();
/* Note: the data to submit cannot be a two-dimensional (or deeper) array.
 * This works:
 *   array('name' => serialize(array('tank', 'zhang')), 'sex' => 1, 'birth' => '20101010')
 * This raises an error:
 *   array('name' => array('tank', 'zhang'), 'sex' => 1, 'birth' => '20101010')
 */
$data = array('name' => 'test', 'sex' => 1, 'birth' => '20101010');
curl_setopt($ch, CURLOPT_URL, 'http://localhost/mytest/curl/upload.php');
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, $data);
// CURLOPT_RETURNTRANSFER is not set, so the response is printed directly
curl_exec($ch);
curl_close($ch);
?>

In upload.php, call print_r($_POST);. curl then fetches whatever upload.php outputs:

Array ( [name] => test [sex] => 1 [birth] => 20101010 )

4. Fetching pages protected by access control

I previously wrote an article on three methods of page access control; have a look if you are interested.

If you fetch such a page with the methods above, you get the following error:

You are not authorized to view this page
You do not have permission to view this directory or page using the credentials that you supplied because your Web browser is sending a WWW-Authenticate header field that the Web server is not configured to accept.

In this case we need CURLOPT_USERPWD to authenticate.
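As a rough sketch of what CURLOPT_USERPWD does, the helper below sends a "user:password" pair using HTTP Basic authentication and returns the status code alongside the body. The function names and the credentials are placeholders for illustration, and Basic authentication is an assumption here: a server that expects another scheme would need a different `CURLOPT_HTTPAUTH` value such as `CURLAUTH_ANY`.

```php
<?php
/**
 * Fetch a page protected by HTTP Basic authentication.
 * $user and $pass are placeholders supplied by the caller.
 */
function fetchWithBasicAuth(string $url, string $user, string $pass): array
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_HTTPAUTH, CURLAUTH_BASIC);
    curl_setopt($ch, CURLOPT_USERPWD, $user . ':' . $pass);
    $body = curl_exec($ch);
    // 401 here means the credentials were rejected
    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    curl_close($ch);
    return array('status' => $status, 'body' => $body);
}

/** The header Basic auth puts on the wire: "Basic " + base64("user:pass"). */
function basicAuthHeader(string $user, string $pass): string
{
    return 'Authorization: Basic ' . base64_encode($user . ':' . $pass);
}

echo basicAuthHeader('user', 'secret'), "\n";
// prints: Authorization: Basic dXNlcjpzZWNyZXQ=
```

Because the credentials are only base64-encoded, not encrypted, Basic authentication is best used over HTTPS.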
<?php
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'http://club-china');
/* CURLOPT_USERPWD is mainly used to get past page access control,
 * for example the protection usually set up with htpasswd. */
//curl_setopt($ch, CURLOPT_USERPWD, '231144:2091XTAjmd=');
curl_setopt($ch, CURLOPT_HTTPGET, 1);
curl_setopt($ch, CURLOPT_REFERER, 'http://club-china');
curl_setopt($ch, CURLOPT_HEADER, 0);
$result = curl_exec($ch);
curl_close($ch);
?>

5. Simulating a login to Sina

The data we want to fetch may only be available after logging in. In that case we use curl to simulate the login.
<?php
// Returns 1 if the login response redirects to the mailbox page, otherwise 0
function checklogin($user, $password)
{
    if (empty($user) || empty($password)) {
        return 0;
    }
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_REFERER, 'http://mail.sina.com.cn/index.html');
    curl_setopt($ch, CURLOPT_HEADER, true);          // keep headers so Location can be checked
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_USERAGENT, USERAGENT);
    curl_setopt($ch, CURLOPT_COOKIEJAR, COOKIEJAR);  // cookies are written here on curl_close()
    curl_setopt($ch, CURLOPT_TIMEOUT, TIMEOUT);
    curl_setopt($ch, CURLOPT_URL, 'http://mail.sina.com.cn/cgi-bin/login.cgi');
    curl_setopt($ch, CURLOPT_POST, true);
    curl_setopt($ch, CURLOPT_POSTFIELDS, '&logintype=uid&u=' . urlencode($user) . '&psw=' . urlencode($password));
    $contents = curl_exec($ch);
    curl_close($ch);
    // Slashes, the dot and the question mark must be escaped inside the pattern
    if (!preg_match('/Location: (.*)\/cgi\/index\.php\?check_time=(.*)\n/', $contents, $matches)) {
        return 0;
    } else {
        return 1;
    }
}

define('USERAGENT', $_SERVER['HTTP_USER_AGENT']);
define('COOKIEJAR', tempnam('/tmp', 'cookie'));
define('TIMEOUT', 500);
echo checklogin('zhangying215', 'xtaj227');
?>

Open the cookie file under /tmp to take a look:

# Netscape HTTP Cookie File
# http://curl.haxx.se/rfc/cookie_spec.html
# This file was generated by libcurl! Edit at your own risk.

mail.sina.com.cn FALSE / FALSE 0 SINAMAIL-WEBFACE-SESSID 65223c4bd8900284ed463d2a3e1ac182
#HttpOnly_.sina.com.cn TRUE / FALSE 0 SUE es%3D8d96db0820c6c79922ad57d422f575e8%26ev%3Dv0%26es2%3Dcddfb8400dc5ca95902367ddcd7f57dd
.sina.com.cn TRUE / FALSE 0 SUP cv%3D1%26bt%3D1286900433%26et%3D1286986833%26lt%3D1%26uid%3D1445632344%26user%3D%25E5%25BC%25A0%25E6%2598%25A02001%26ag%3D2%26name%3Dzhangying20015%2540sina.com%26nick%3D%25E5%25BC%25A0%25E6%2598%25A02001%26sex%3D1%26ps%3D0%26email%3Dzhangying20015%2540sina.com%26dob%3D1982-07-18
#HttpOnly_.sina.com.cn TRUE / FALSE 0 SID BihcallomxMx-QZxzGrOlcSQx%2F0B%2F0cmr.NyQ%2F0B%2FcmGGalmarlmcHrcGlSmrmxmfxal_CBZ%2F_afugCmmGirBYHm0Bc%40fr5ciZiGG5i
#HttpOnly_.sina.com.cn TRUE / FALSE 0 SPRIAL bfb4102951fd5892a3fd5b42d442cd26
#HttpOnly_.sina.com.cn TRUE / FALSE 0 SINA_USER %D5%C5%D2001
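To inspect a cookie file like this from PHP rather than by eye, a small parser is enough. The sketch below assumes the standard Netscape layout, in which each cookie line has seven tab-separated fields (the listing above shows them separated by spaces). The function name `parseCookieFile` is made up for this illustration.

```php
<?php
/**
 * Parse the text of a Netscape-format cookie file into an array of cookies.
 * Fields are tab-separated: domain, subdomain flag, path, secure, expiry, name, value.
 */
function parseCookieFile(string $text): array
{
    $cookies = array();
    foreach (preg_split('/\r?\n/', $text) as $line) {
        $httpOnly = false;
        // libcurl marks HttpOnly cookies with this prefix; such lines are not comments
        if (strpos($line, '#HttpOnly_') === 0) {
            $httpOnly = true;
            $line = substr($line, strlen('#HttpOnly_'));
        }
        if ($line === '' || $line[0] === '#') {
            continue; // blank line or real comment
        }
        $fields = explode("\t", $line);
        if (count($fields) !== 7) {
            continue; // not a cookie line
        }
        $cookies[] = array(
            'domain'   => $fields[0],
            'path'     => $fields[2],
            'secure'   => $fields[3] === 'TRUE',
            'expires'  => (int) $fields[4],
            'name'     => $fields[5],
            'value'    => $fields[6],
            'httpOnly' => $httpOnly,
        );
    }
    return $cookies;
}
```

It could be called as `parseCookieFile(file_get_contents(COOKIEJAR))`. To send the saved cookies on a later request, setting `CURLOPT_COOKIEFILE` to the same path lets curl read them back without any parsing on your side.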
