Simple Data Scraping with Python
阿新 • Published: 2017-06-05
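It failed. Even with headers and parameters identical to what Firefox shows, it does not work: a page comes back, but it contains no data. I tried disabling JS and then saw a cookie (with JS enabled there is no cookie). Scraping with that cookie still did not work, and when I checked again some time later the cookie's content had not changed either. A bit discouraging, but I'm posting it anyway, partly as a small to-do for myself.
If any expert happens to pass by, I'd be grateful for advice.
The scripts for the other two sites do work; I'll post that code shortly without walking through the process.
Target site: http://www.geomag.bgs.ac.uk/data_service/models_compass/igrf_form.shtml
First, handle the conversion from a date to decimal years. This is a rough conversion that only goes down to day resolution.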
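def date2dy(year, month, day):
    # Days per month in a common year; February is patched for leap years.
    months = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]
    oneyear = 365
    # Gregorian rule: century years are leap only when divisible by 400.
    if year % 100 == 0:
        if year % 400 == 0:
            months[1] = 29
            oneyear = 366
    elif year % 4 == 0:
        months[1] = 29
        oneyear = 366
    # Sum the full months before `month` (the list is 0-indexed).
    days = 0
    i = 1
    while i < month:
        days = days + months[i - 1]
        i = i + 1
    days = days + day - 1
    # Divide by the actual length of this year rather than a fixed 366.
    return year + days / oneyear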
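The same day-level conversion can also be sketched with the standard datetime module, which handles leap years on its own. The function name decimal_year is made up for this illustration and is not part of the original script.

```python
from datetime import date


def decimal_year(year, month, day):
    """Rough decimal year at day resolution (hypothetical helper name)."""
    start = date(year, 1, 1)
    next_start = date(year + 1, 1, 1)
    # Whole days elapsed since 1 January of the same year.
    elapsed = (date(year, month, day) - start).days
    # Length of this particular year: 365 or 366 days.
    year_length = (next_start - start).days
    return year + elapsed / year_length


print(round(decimal_year(2016, 12, 1), 2))  # 2016.92
```

For 2016-12-01 this gives 2016 + 335/366, which rounds to the 2016.92 used as the date parameter in the request.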
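The first small goal is to fetch the data for 2016-12-01.
Open the Firefox developer tools with F12 and switch to the Network tab.
Submit the form to capture the request.
The useful pieces are the request headers, the request URL, and the parameters. Copy them into a program and give it a try.
I spent a bit more than a day on this part and could not get the data. I'm such a noob.jpg
I'll put the code here for now; if any expert passes by, advice is very welcome.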
#!/usr/bin/python
import sys

import requests

web_url = r'http://www.geomag.bgs.ac.uk/data_service/models_compass/igrf_form.shtml'
request_url = r'http://www.geomag.bgs.ac.uk/cgi-bin/igrfsynth'
filepath = sys.path[0] + '\\data_igrf_raw_' + '.html'

# Headers copied from Firefox. The Content-Length that Firefox showed (136)
# is left out here because requests computes it from the body.
headers = {
    'Host': 'www.geomag.bgs.ac.uk',
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; rv:53.0) Gecko/20100101 Firefox/53.0',
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
    'Accept-Language': 'zh-CN,zh;q=0.8,en-US;q=0.5,en;q=0.3',
    'Accept-Encoding': 'gzip, deflate',
    'Content-Type': 'application/x-www-form-urlencoded',
    'Referer': web_url,
    'Connection': 'keep-alive',
    'Upgrade-Insecure-Requests': '1'
}

payload = {
    'name': '-',        # your name and email address
    'coord': '1',       # '1': Geodetic  '2': Geocentric
    'date': '2016.92',  # decimal years
    'alt': '150',       # Altitude
    'place': '',
    'degmin': 'y',      # Position Coordinates: 'y': In Degrees and Minutes  'n': In Decimal Degrees
    'latd': '60',       # latitude degrees (degrees negative for south)
    'latm': '0',        # latitude minutes
    'lond': '120',      # longitude degrees (degrees negative for west)
    'lonm': '0',        # longitude minutes
    'tot': 'y',         # Total Intensity (F)
    'dec': 'y',         # Declination (D)
    'inc': 'y',         # Inclination (I)
    'hor': 'y',         # Horizontal Intensity (H)
    'nor': 'y',         # North Component (X)
    'eas': 'y',         # East Component (Y)
    'ver': 'y',         # Vertical Component (Z)
    'map': '0',         # Include a Map of the Location: '0': NO  '1': YES
    'sv': 'n'           # set to 'y' if Secular Variation (rate of change) is needed
}

r = requests.post(request_url, data=payload, headers=headers)
# Save the raw response so it can be inspected in a browser.
with open(filepath, 'w', encoding='utf-8') as fid:
    fid.write(r.text)
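One thing that can be checked offline is what the form body actually looks like once the parameters are encoded, and how long it is compared with the Content-Length of 136 that Firefox reported for its own request. The sketch below uses only the standard library's urlencode and sends nothing over the network. It is a debugging aid under the assumption that the server receives an ordinary form-urlencoded body, not a confirmed fix for the missing data.

```python
from urllib.parse import urlencode

# Same parameters as in the script above.
payload = {
    'name': '-', 'coord': '1', 'date': '2016.92', 'alt': '150',
    'place': '', 'degmin': 'y',
    'latd': '60', 'latm': '0', 'lond': '120', 'lonm': '0',
    'tot': 'y', 'dec': 'y', 'inc': 'y', 'hor': 'y',
    'nor': 'y', 'eas': 'y', 'ver': 'y',
    'map': '0', 'sv': 'n',
}

# Encode the dict as application/x-www-form-urlencoded text.
body = urlencode(payload)
print(body)

# Length in bytes of the encoded body, for comparison with the
# Content-Length value shown in the Firefox Network tab.
print(len(body.encode('ascii')))
```

If the printed body or its length differs from what the Network tab shows for the browser's request, the difference points at a parameter that the script sends differently from the form.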