Hi Jerry,
Sorry for the trouble with this. We are currently blocking the Baidu user agent from crawling GitHub Pages sites in response to this user agent being responsible for an excessive amount of requests, which was causing availability issues for other GitHub customers.
This is unlikely to change any time soon, so if you need the Baidu user agent to be able to crawl your site you will need to host it elsewhere.
Apologies again for the inconvenience.
Cheers,
Alex
Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)
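The email says the block targets the Baidu user agent but does not spell out the mechanism. Assuming it is keyed on the User-Agent request header, a minimal sketch like the following could compare the HTTP status a site returns for the Baiduspider string against any other user agent. The helper names `probeOptions` and `probeStatus` are hypothetical and only illustrate the idea.

```php
<?php
// The Baiduspider User-Agent string quoted above.
const BAIDU_UA = 'Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)';

// Hypothetical helper: build cURL options for a headers-only probe
// that presents the given User-Agent.
function probeOptions(string $url, string $userAgent): array
{
    return [
        CURLOPT_URL            => $url,
        CURLOPT_USERAGENT      => $userAgent,
        CURLOPT_NOBODY         => true,  // request headers only, skip the body
        CURLOPT_RETURNTRANSFER => true,  // do not echo the response
        CURLOPT_TIMEOUT        => 10,
    ];
}

// Hypothetical helper: return the HTTP status code for the probe,
// or 0 if no response was received.
function probeStatus(string $url, string $userAgent): int
{
    $ch = curl_init();
    curl_setopt_array($ch, probeOptions($url, $userAgent));
    curl_exec($ch);
    $status = (int) curl_getinfo($ch, CURLINFO_HTTP_CODE);
    curl_close($ch);
    return $status;
}
```

Calling `probeStatus('http://jerryzou.com', BAIDU_UA)` and then the same URL with a browser user agent would show whether the two are treated differently. This is an assumption-based check, not a description of how GitHub implements the block.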
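- Every page that was crawled successfully was served by the 209.9.130.5 node
- Every page that failed to be crawled was served by the 209.9.130.6 node
- Pinging jerryzou.com from my own machine reaches the 209.9.130.8 node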
<?php
// Request $url once, presenting $ip as the client address via spoofed headers.
function Curl($url, $ip)
{
    $ch = curl_init();
    curl_setopt_array($ch, [
        CURLOPT_URL            => $url,
        CURLOPT_TIMEOUT        => 10,
        CURLOPT_HEADER         => true,
        CURLOPT_HTTPHEADER     => [
            'X-FORWARDED-FOR: ' . $ip,
            'CLIENT-IP: ' . $ip,
        ],
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_NOBODY         => false,
        CURLOPT_REFERER        => 'http://test.jerryzou.com',
        CURLOPT_USERAGENT      => 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/29.0.1547.66 Safari/537.36',
    ]);
    $response = curl_exec($ch);
    curl_close($ch);
    return $response;
}

// IP addresses to present in the spoofed headers, one request per address.
$ipList = [
    '203.125.234.1', '220.181.7.1', '123.125.66.1', '123.125.71.1',
    '119.63.192.1', '119.63.193.1', '119.63.194.1', '119.63.195.1',
    '119.63.196.1', '119.63.197.1', '119.63.198.1', '119.63.199.1',
    '180.76.5.1', '202.108.249.185', '202.108.249.177', '202.108.249.182',
    '202.108.249.184', '202.108.249.189', '61.135.146.200', '61.135.145.221',
    '61.135.145.207', '202.108.250.196', '68.170.119.76', '207.46.199.52',
];

foreach ($ipList as $ip) {
    Curl('http://jerryzou.com', $ip);
    echo "$ip\n";
}
echo "Done\n";