Python: Fetching POIs with the Amap (高德) API (Example: Healthcare Service POIs in Nanshan District, Shenzhen)
阿新 • Published: 2017-09-30
The following is original content; please credit the source when reposting.
import xlwt  # creates the Excel file (workbook, sheet, cell writes, save); install from CMD with: pip install xlwt
import urllib.request  # URL requests, bundled with Python 3; for the urllib differences between Python 2 and 3 see: http://blog.csdn.net/Jurbo/article/details/52313636
from bs4 import BeautifulSoup  # library for quickly reading tag contents; install from CMD with: pip install beautifulsoup4 (the "xml" parser used below also needs lxml)
import re  # regular expressions, used for `pattern` below; quick tutorial: http://www.runoob.com/regexp/regexp-syntax.html

poiTag = ["id", "name", "type", "typecode", "biz_type", "address", "location", "tel", "pname", "cityname", "adname"]  # POI tags returned when extensions=base
poiSoupTag = ["idSoup", "nameSoup", "typeSoup", "typecodeSoup", "biz_typeSoup", "addressSoup", "locationSoup", "telSoup", "pnameSoup", "citynameSoup", "adnameSoup"]  # holders for the matching tag lists
pattern = re.compile("(?:>)(.*?)(?=<)", re.S)  # captures the text between ">" and "<"
poiExcel = xlwt.Workbook()  # new workbook
sheet = poiExcel.add_sheet("poiResult")  # new worksheet named "poiResult"
for colIndex in range(len(poiTag)):
    sheet.write(0, colIndex, poiTag[colIndex])  # write the header row
offset = 10  # this example shows 10 POIs per page (official limit: 25)
maxPage = 10  # at most 10 pages (official limit: 100)
types = "090000"  # example category: healthcare service POIs; code list: http://a.amap.com/lbs/static/zip/AMap_poicode.zip
city = "440305"  # example district: Nanshan District, Shenzhen; code list: http://a.amap.com/lbs/static/zip/AMap_adcode_citycode.zip
for pageIndex in range(1, maxPage + 1):
    try:
        url = "http://restapi.amap.com/v3/place/text?&keywords=&types=" + types + "&city=" + city + "&citylimit=true&output=xml&offset=" + str(offset) + "&page=" + str(pageIndex) + "&key=YOUR_KEY&extensions=base"
        # The structured request URL is built as above; use your own key, see: http://lbs.amap.com/api/webservice/guide/api/search/
        poiSoup = BeautifulSoup(urllib.request.urlopen(url).read(), "xml")  # read the page for this page number
        for tagIndex in range(len(poiTag)):
            poiSoupTag[tagIndex] = poiSoup.findAll(poiTag[tagIndex])  # collect this page's elements for each tag
        for rowIndex in range(len(poiSoupTag[0])):
            for colIndex in range(len(poiSoupTag)):
                sheet.write(len(poiSoupTag[0]) * (pageIndex - 1) + rowIndex + 1, colIndex, re.findall(pattern, str(poiSoupTag[colIndex][rowIndex])))
                # extract the content with the regular expression and write it to the matching row and column
    except Exception as e:
        print(e)  # print the error and continue

poiExcel.save("E:/POI&" + types + "&" + city + ".xls")  # save
print("Done!")  # finished
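The request URL in the script is assembled by string concatenation. As a minimal sketch of the same request, the parameters can instead be passed through the standard library's `urllib.parse.urlencode`, which also escapes any special characters in a keyword or key. The helper name `build_poi_url` and the constant `BASE_URL` are hypothetical names introduced here for illustration; the parameter names and values are the ones used in the script.

```python
from urllib.parse import urlencode

# Endpoint taken from the script above.
BASE_URL = "http://restapi.amap.com/v3/place/text"


def build_poi_url(key, types, city, page, offset=10):
    """Return the text-search URL for one result page (hypothetical helper)."""
    params = {
        "keywords": "",
        "types": types,          # POI category code, e.g. "090000"
        "city": city,            # district code, e.g. "440305"
        "citylimit": "true",
        "output": "xml",
        "offset": offset,        # POIs per page
        "page": page,            # page number, starting at 1
        "key": key,              # your own API key
        "extensions": "base",
    }
    return BASE_URL + "?" + urlencode(params)


url = build_poi_url("YOUR_KEY", "090000", "440305", page=1)
print(url)
```

Keeping the parameters in a dictionary makes it easier to change the category, district, or page size without editing a long string.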
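Note: when the page size is large, some cells may occasionally hit an overwrite error (my guess is that this is related to some pages returning incomplete data). Because exceptions are caught and printed, the script keeps running, but a very small number of POIs will be lost.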
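One possible contributor to the overwrite error is the row formula `len(poiSoupTag[0]) * (pageIndex - 1) + rowIndex + 1`: it uses the current page's result count, so a page with fewer results than the earlier pages maps onto rows that were already written. The sketch below shows one way to avoid that, using a running row counter. It makes two assumptions that the original post does not confirm: that each POI in the XML response is wrapped in its own `<poi>` element, and that reading the fields per POI is acceptable. It also swaps BeautifulSoup for the standard library's `xml.etree.ElementTree` so that it runs without extra packages. The function name `rows_from_pages` and the sample XML are made up for illustration and are not real API output.

```python
import xml.etree.ElementTree as ET

POI_TAGS = ["id", "name", "type", "typecode", "biz_type", "address",
            "location", "tel", "pname", "cityname", "adname"]


def rows_from_pages(pages, tags=POI_TAGS):
    """Yield (row_number, values) for every <poi> element across all pages.

    Row 0 is reserved for the header, so numbering starts at 1 and advances
    by one per POI; a short page therefore cannot reuse a row number.
    """
    next_row = 1
    for page_xml in pages:
        root = ET.fromstring(page_xml)
        for poi in root.iter("poi"):  # assumption: one <poi> element per POI
            # A missing child tag yields an empty string instead of an error.
            values = [(poi.findtext(tag) or "").strip() for tag in tags]
            yield next_row, values
            next_row += 1


# Made-up sample pages: a full page of two POIs followed by a short page.
page1 = ("<response><pois>"
         "<poi><id>A1</id><name>Clinic A</name></poi>"
         "<poi><id>A2</id><name>Clinic B</name></poi>"
         "</pois></response>")
page2 = ("<response><pois>"
         "<poi><id>A3</id><name>Clinic C</name></poi>"
         "</pois></response>")

rows = list(rows_from_pages([page1, page2], tags=["id", "name"]))
for row_number, values in rows:
    print(row_number, values)
```

With the original formula, the single POI on the second sample page would be assigned row 2, which the first page has already filled; with the counter it gets row 3. Each `(row_number, values)` pair can then be written cell by cell with `sheet.write(row_number, colIndex, value)`.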