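[PHP] Writing your own ZoomEye-based API in PHP (a must for the lazy quq)
0x01 Why
I'm slow, so I can't keep up with other people when hunting vulnerabilities, and collecting targets one by one by hand really is slow. So I wanted to write my own API that grabs 20 pages of ZoomEye results in one go, ready for batch runs ovo (brilliant!)
0x02 Getting to work
1. Capturing the traffic
To build a scraper, the first step is of course to capture the traffic~
Out comes my Burp Suite (bp)~
First type in a keyword, so that it is easy to spot in Burp
Then press Enter~
My keyword shows up in this GET request, but is this really the right one?
Send it to the Repeater module and find out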
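2. Simulating the request with PHP's curl
PHP supports libcurl, a library created by Daniel Stenberg that lets you connect to and communicate with many kinds of servers over many kinds of protocols.
libcurl currently supports the http, https, ftp, gopher, telnet, dict, file, and ldap protocols. It also supports HTTPS certificates, HTTP POST, HTTP PUT, FTP uploads (which can also be done with PHP's FTP extension), HTTP form-based uploads, proxies, cookies, and username + password authentication.
How to make GET and POST requests with cURL in PHP
These functions were introduced in PHP 4.0.2.
In other words, curl has been available since PHP 4.0.2, and it handles both POST and GET. How useful is that!
Out comes the request I recorded a moment ago~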
GET /search?q=keywords&p=1 HTTP/1.1
Host: www.zoomeye.org
Connection: close
Accept: application/json, text/plain, */*
Sec-Fetch-Dest: empty
Cube-Authorization: eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJ1c2VybmFtZSI6I**tVkRTd29sX0d2cXo4clFYX1VkZ3ExZUV3Y2MiLCJlbWFpbCI6IjEyMDU4NjY5ODVAcXEuY29tIiwiZXhwIjoxNTg5MDc5MzA3LjB9.Vj0nd-tC3Z8FIg0TvBuNgsoksv4RtS9ryDaTr5TDYa0
User-Agent: Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.149 Safari/537.36
Sec-Fetch-Site: same-origin
Sec-Fetch-Mode: cors
Referer: https://www.zoomeye.org/searchResult?q=keywords
Accept-Encoding: gzip, deflate
Accept-Language: zh-CN,zh;q=0.9
Cookie: __root_domain_v=.zoomeye.org; _qddaz=QD.hhi2ek.7ofq41.k9nl84mk; __jsluid_s=68ead3868c48be189ad9a36aedae89b2; Hm_lvt_3c8266fabffc08ed4774a252adcb9263=1588484284,1588486025,1588992857,1588992907; _qddab=3-qc83zy.k9z1clvv; __jsl_clearance=1589003429.168|0|QzmwnseUa6LsD9SPada9A%2F68MUg%3D; Hm_lpvt_3c8266fabffc08ed4774a252adcb9263=1589003970
Then describe it in PHP:
<?php
function curl_post($url) {
/*-----------------SET COOKIE-------*/
$cookies=' __root_domain_v=.zoomeye.org; _qddaz=QD.hhi2ek.7ofq41.k9nl84mk; __jsluid_s=68ead3868c48be189ad9a36aedae89b2; Hm_lvt_3c8266fabffc08ed4774a252adcb9263=1588484284,1588486025,1588992857,1588992907; _qddab=3-qc83zy.k9z1clvv; __jsl_clearance=1588999664.016|0|HlMEMiGt3peQ%2FyF5pwOoAVi7Hhg%3D; Hm_lpvt_3c8266fabffc08ed4774a252adcb9263=1588999939';
/*-----------------SET COOKIE---------*/
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, false);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_HEADER, TRUE);
$headers = array();
$headers[] = 'Host:www.zoomeye.org';
$headers[] = 'Connection: close';
$headers[] = 'Accept: application/json, text/plain, */*';
$headers[] = 'Sec-Fetch-Dest: empty';
/**/
$headers[] = 'Cube-Authorization: eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJ1c2VybmFtZSI6I**tVkRTd29sX0d2cXo4clFYX1VkZ3ExZUV3Y2MiLCJlbWFpbCI6IjEyMDU4NjY5ODVAcXEuY29tIiwiZXhwIjoxNTg5MDc5MzA3LjB9.Vj0nd-tC3Z8FIg0TvBuNgsoksv4RtS9ryDaTr5TDYa0';
$headers[] = 'User-Agent: Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.149 Safari/537.36';
$headers[] = 'Sec-Fetch-Site: same-origin';
$headers[] = 'Sec-Fetch-Mode: cors';
$headers[] = 'Referer: https://www.zoomeye.org/searchResult?q=%22Office%20Anywhere%202017%22';
$headers[] = 'Accept-Language: zh-CN,zh;q=0.9';
$headers[] = 'Cookie: '.$cookies;
$headers[] = 'If-None-Match: W/"3828048cfa646c65b99b190eb8c4418ee44f4da2"';
curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);
$output= curl_exec($ch);
curl_close($ch);
return $output;
}
$a=curl_post('https://www.zoomeye.org/search?q=keywords&p=1');
var_dump($a);
?>
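That completes one curl request to ZoomEye from PHP
curl_setopt($ch, CURLOPT_URL, $url);
This line sets the URL to request
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, false);
These two lines turn off SSL certificate verification (peer and host name)
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
This line decides how the result comes back: roughly, 0 prints it straight to the screen, and 1 returns it as a string that can be stored in a variable
curl_setopt($ch, CURLOPT_HEADER, TRUE);
This keeps the response headers in the output
Finally, the return value $a is printed to the screen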
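Because CURLOPT_HEADER is TRUE, $a holds the response headers followed by the body, so the body has to be separated before it can be handed to json_decode. Here is a minimal sketch of that split. The function name split_response and the sample response are made up for illustration, and it assumes a single response with CRLF line endings and no redirects.

```php
<?php
// Split a raw cURL result (captured with CURLOPT_HEADER enabled) into
// its header block and its body. Assumes one response, CRLF line endings,
// and no redirect or "100 Continue" preamble.
function split_response(string $raw): array {
    $pos = strpos($raw, "\r\n\r\n");
    if ($pos === false) {
        // No separator found: treat everything as body.
        return ['', $raw];
    }
    return [substr($raw, 0, $pos), substr($raw, $pos + 4)];
}

// Made-up sample, shaped like the response described in this post.
$raw = "HTTP/1.1 200 OK\r\nContent-Type: application/json\r\n\r\n"
     . '{"status": 200, "matches": []}';

list($head, $body) = split_response($raw);
$data = json_decode($body, true);   // associative array, or null on failure
echo $data['status'], PHP_EOL;
```

With a live handle, curl_getinfo($ch, CURLINFO_HEADER_SIZE) gives the header length directly and could replace the string search.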
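Let's see how it looks
Some of you may ask: but what came back looks like this?
Don't panic, right-click and view the page source
Now it is identical to the response Burp returned earlier
All that's left is to pull out ip + port, and it's done
There are two ways to do it:
1. Take the JSON out of the response and parse it
2. Match everything directly with a regex
I went with the second
JSON parsing kept failing on my side, so I used a regex: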
$pattern = '/"ip": "(.*?)"(.*?)", "geoinfo(.*?)/i';
preg_match_all($pattern, $a, $match);
This pulls the IPs out of $a
Then just extract the ports the same way and join the two together
$patternone = '/"port":(.*?)(.*?), "service"(.*?)/i';
preg_match_all($patternone, $a, $match1);
But what this captures is still of the form "port": xxx, "service", so a text-extraction function filters it a second time:
function getSubstr($str, $leftStr, $rightStr) {
$left = strpos($str, $leftStr);
$right = strpos($str, $rightStr,$left);
if ($left === false || $right === false || $right < $left) return ''; // strpos returns false (never a negative number) when nothing is found
return substr($str, $left + strlen($leftStr), $right-$left-strlen($leftStr));
}
Then getSubstr pulls the value out. The regex results live in the $match1 array, though, so a for loop copies them one by one into a new array, $port
for ($i=0;$i<count($match1[0]);$i++) {
$port[$i]=getSubstr($match1[0][$i],'"port": ',', "service');
}
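As an alternative to the second getSubstr pass, a capture group that matches only digits returns the bare port numbers in a single step. This is a sketch on a made-up fragment that imitates the "port": xxx, "service" shape above; the "portinfo" nesting and the sample addresses are assumptions, not taken from a real response.

```php
<?php
// Made-up fragment imitating the '"port": N, "service"' shape quoted above.
$a = '{"ip": "192.0.2.10", "portinfo": {"port": 8080, "service": "http"}}, '
   . '{"ip": "192.0.2.11", "portinfo": {"port": 443, "service": "https"}}';

// \s* tolerates optional whitespace after the colon;
// (\d+) captures the digits only, so no second filtering pass is needed.
preg_match_all('/"port":\s*(\d+)/', $a, $m);

$port = $m[1];   // list of port strings, in order of appearance
print_r($port);
```

$m[1] holds only the captured digits, so it can be used where the $port array is used, without the extra loop.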
Then join them together ($ccc holds count($match[1])):
for ($i=0;$i<$ccc;$i++) {
$url[$i]=$match[1][$i];
if(checkIp($url[$i])) {
echo addslashes($url[$i].':'.$port[$i].'</p>');
}
}
checkIp checks that each entry in the IP array is a valid IP
Otherwise odd things sneak in
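For comparison, the first approach (parsing the JSON) might look like the sketch below. It assumes the headers have already been removed from the response, and that the body has a top-level "matches" list whose items carry "ip" and a nested "portinfo" => "port". That shape is a guess based on the regexes above, not something the captured response confirms. The function name extract_targets is hypothetical, and PHP's built-in filter_var stands in for checkIp.

```php
<?php
// Assumed response shape (not confirmed by the post): a top-level "matches"
// list whose items carry "ip" and a nested "portinfo" => "port".
function extract_targets(string $body): array {
    $data = json_decode($body, true);
    if (!is_array($data) || !isset($data['matches']) || !is_array($data['matches'])) {
        return [];
    }
    $targets = [];
    foreach ($data['matches'] as $item) {
        $ip   = $item['ip'] ?? '';
        $port = $item['portinfo']['port'] ?? null;
        if (!is_string($ip) || $port === null) {
            continue;   // skip entries missing either field
        }
        // Built-in IPv4 validation instead of a hand-written check.
        if (filter_var($ip, FILTER_VALIDATE_IP, FILTER_FLAG_IPV4) === false) {
            continue;
        }
        $targets[] = $ip . ':' . $port;
    }
    return $targets;
}

// Made-up sample body with one valid entry and two that get skipped.
$body = '{"status": 200, "matches": ['
      . '{"ip": "192.0.2.10", "portinfo": {"port": 8080, "service": "http"}},'
      . '{"ip": "999.1.1.1", "portinfo": {"port": 80, "service": "http"}},'
      . '{"ip": "192.0.2.12"}]}';
print_r(extract_targets($body));
```

If the real field names differ, only the two lookups inside the loop need to change.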
0x03 So the complete code is as follows:
<?php
function getSubstr($str, $leftStr, $rightStr) {
$left = strpos($str, $leftStr);
$right = strpos($str, $rightStr,$left);
if ($left === false || $right === false || $right < $left) return ''; // strpos returns false (never a negative number) when nothing is found
return substr($str, $left + strlen($leftStr), $right-$left-strlen($leftStr));
}
function checkIp($ip) {
$arr = explode('.',$ip);
if(count($arr) != 4) {
return false;
} else {
for ($i = 0;$i < 4;$i++) {
if(($arr[$i] <'0') || ($arr[$i] > '255')) {
return false;
}
}
}
return true;
}
function curl_post($url) {
/*-----------------SET COOKIE-------*/
$cookies=' __root_domain_v=.zoomeye.org; _qddaz=QD.hhi2ek.7ofq41.k9nl84mk; __jsluid_s=68ead3868c48be189ad9a36aedae89b2; Hm_lvt_3c8266fabffc08ed4774a252adcb9263=1588484284,1588486025,1588992857,1588992907; _qddab=3-qc83zy.k9z1clvv; __jsl_clearance=1588999664.016|0|HlMEMiGt3peQ%2FyF5pwOoAVi7Hhg%3D; Hm_lpvt_3c8266fabffc08ed4774a252adcb9263=1588999939';
/*-----------------SET COOKIE---------*/
/*---------------set ca-------------*/
$ca='eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJ1c2VybmFtZSI6I**tVkRTd29sX0d2cXo4clFYX1VkZ3ExZUV3Y2MiLCJlbWFpbCI6IjEyMDU4NjY5ODVAcXEuY29tIiwiZXhwIjoxNTg5MDc5MzA3LjB9.Vj0nd-tC3Z8FIg0TvBuNgsoksv4RtS9ryDaTr5TDYa0';
/*----------------end ---set-------------*/
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, false);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_HEADER, TRUE);
$headers = array();
$headers[] = 'Host:www.zoomeye.org';
$headers[] = 'Connection: close';
$headers[] = 'Accept: application/json, text/plain, */*';
$headers[] = 'Sec-Fetch-Dest: empty';
/**/
$headers[] = 'Cube-Authorization: '.$ca;
$headers[] = 'User-Agent: Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.149 Safari/537.36';
$headers[] = 'Sec-Fetch-Site: same-origin';
$headers[] = 'Sec-Fetch-Mode: cors';
$headers[] = 'Referer: https://www.zoomeye.org/searchResult?q=%22Office%20Anywhere%202017%22';
$headers[] = 'Accept-Language: zh-CN,zh;q=0.9';
$headers[] = 'Cookie: '.$cookies;
$headers[] = 'If-None-Match: W/"3828048cfa646c65b99b190eb8c4418ee44f4da2"';
curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);
$output= curl_exec($ch);
curl_close($ch);
return $output;
}
function get($page) {
$flag=false;
/*----------------SET KEYWORDS-------------*/
$keywords='"phpStudy%20探針%202014%20"';
/*-----------------\SET KEY WORDS!/-------------*/
$store=array();
$a=curl_post('https://www.zoomeye.org/search?q='.$keywords.'&p='.$page);
$status=getSubstr($a,'{"status": ',', "matches"');
if ($status!=200) {
echo $status.'</p>'.'Requests too frequent or cookie expired. Go back to ZoomEye to solve the CAPTCHA or grab a fresh cookie, then continue scraping: <a href="https://www.zoomeye.org/searchResult?q='.$keywords.'">Back to ZoomEye</a>'.PHP_EOL;
return true;
} else {
$pattern = '/"ip": "(.*?)"(.*?)", "geoinfo(.*?)/i';
preg_match_all($pattern, $a, $match);
$patternone = '/"port":(.*?)(.*?), "service"(.*?)/i';
preg_match_all($patternone, $a, $match1);
$port=array();
for ($i=0;$i<count($match1[0]);$i++) {
$port[$i]=getSubstr($match1[0][$i],'"port": ',', "service');
}
$ccc=count($match[1]);
for ($i=0;$i<$ccc;$i++) {
$url[$i]=$match[1][$i];
if(checkIp($url[$i])) {
echo addslashes($url[$i].':'.$port[$i].'</p>');
}
}
}
}
for ($i=1;$i<=20;$i++) {
sleep(2);
$flag=get($i);
if($flag) {
break;
}
}
?>
keywords, cookies, and Cube-Authorization all have to be changed to match your own ZoomEye request; I'm too much of a noob to automate that part
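One way to avoid editing the script each time is to read these values from outside and let PHP do the URL encoding. This is a rough sketch: the environment variable names ZOOMEYE_COOKIE and ZOOMEYE_CUBE_AUTH and the helper names are made up for illustration, and the /search?q=...&p=... path is the one from the captured request above.

```php
<?php
// Build the search URL with proper percent-encoding of the keyword.
// PHP_QUERY_RFC3986 encodes spaces as %20 rather than '+'.
function build_search_url(string $keywords, int $page): string {
    $query = http_build_query(['q' => $keywords, 'p' => $page], '', '&', PHP_QUERY_RFC3986);
    return 'https://www.zoomeye.org/search?' . $query;
}

// Read a value from the environment, falling back to an empty string.
function env_or_empty(string $name): string {
    $value = getenv($name);
    return $value === false ? '' : $value;
}

// Hypothetical variable names; set them in the shell before running.
$cookies = env_or_empty('ZOOMEYE_COOKIE');
$ca      = env_or_empty('ZOOMEYE_CUBE_AUTH');

echo build_search_url('"Office Anywhere 2017"', 1), PHP_EOL;
```

For that keyword the query string comes out as q=%22Office%20Anywhere%202017%22&p=1, which matches the encoding seen in the Referer header of the captured request.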
Finally, here is the result: