Lagou (拉勾網)
阿新 • Published: 2017-10-20
Note: if the requests module raises an error saying it has no attribute 'get', the script's filename is the same as the name of some Python library.
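One rough way to confirm the filename clash described in the note is to ask Python which file it would load for a given module name. The sketch below uses only the standard library; `module_origin` and `is_shadowed_by_cwd` are hypothetical helper names introduced here for illustration, and the check only catches a same-named `.py` file sitting directly in the working directory.

```python
import importlib.util
import os


def module_origin(name):
    """Return the path Python would load for module `name`, or None if not found."""
    spec = importlib.util.find_spec(name)
    return spec.origin if spec else None


def is_shadowed_by_cwd(name, cwd=None):
    """Rough check: is `name` resolved to a file directly inside the working directory?"""
    origin = module_origin(name)
    # Built-in or frozen modules report a non-path origin, so skip those.
    if origin is None or not os.path.isfile(origin):
        return False
    cwd = os.path.abspath(cwd or os.getcwd())
    return os.path.dirname(os.path.abspath(origin)) == cwd


if __name__ == '__main__':
    # 'json' is used only because it ships with Python; substitute 'requests'
    # when checking the situation from the note.
    print(module_origin('json'))
    print(is_shadowed_by_cwd('json'))
```

If the reported path points into your own project folder rather than the installed library, renaming that local file should resolve the error.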
****************************************分割線****************************************
Python jobs in Shenzhen on Lagou:
import requests
from openpyxl import Workbook

info = []
s = requests.session()
# Visit the listing page first so the session receives the site's cookies.
s.get('https://www.lagou.com/jobs/list_Python?fromSearch=true')
# Anti-scraping behavior analysis: an LGUID is added after 4 pages have been visited.
s.cookies['LGUID'] = s.cookies['user_trace_token']
headers = {'User-Agent': 'Mozilla/5.0 Chrome/61.0.3163.100 Safari/537.36',
           'Referer': 'https://www.lagou.com/jobs/list_Python?fromSearch=true'}
url = 'https://www.lagou.com/jobs/positionAjax.json'

for page in range(1, 10):
    print('begin to handle page %s' % page)
    data = dict(city='深圳', kd='Python', pn='%s' % page)
    response = s.post(url, data=data, headers=headers).json()
    jobs = response['content']['positionResult']['result']
    for job in jobs:
        workplace = job['city']
        salary = job['salary']
        positionName = job['positionName']
        industryField = job['industryField']
        companySize = job['companySize']
        shortName = job['companyShortName']
        fullName = job['companyFullName']
        # Fall back to an empty list in case the label list is missing a value.
        companyLabelList = ','.join(job['companyLabelList'] or [])
        info.append([workplace, salary, positionName, industryField,
                     companySize, shortName, fullName, companyLabelList])

wb = Workbook()
ws = wb.active
# Header row: city, salary, position, field, size, short name, full name, benefits
ws.append(['城市', '薪資', '職位', '領域', '規模', '簡稱', '全稱', '福利'])
for x in info:
    ws.append(x)
# Raw string so the backslash in the Windows path is not read as an escape.
wb.save(r'E:\拉勾網.xlsx')
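The row-building step of the script can be tried offline, without hitting the site. The following is a minimal sketch under assumptions: `job_to_row` and `rows_from_response` are hypothetical helpers, not part of the original script, and the `sample` payload is made up for illustration and only mirrors the keys the script reads.

```python
# Keys the script reads from each job entry, in spreadsheet column order.
FIELDS = ['city', 'salary', 'positionName', 'industryField',
          'companySize', 'companyShortName', 'companyFullName']


def job_to_row(job):
    """Flatten one job dict into a spreadsheet row."""
    row = [job.get(key, '') for key in FIELDS]
    labels = job.get('companyLabelList') or []  # tolerate a missing or null list
    row.append(','.join(labels))
    return row


def rows_from_response(payload):
    """Collect rows from a decoded payload shaped like the script's response."""
    jobs = payload.get('content', {}).get('positionResult', {}).get('result', [])
    return [job_to_row(job) for job in jobs]


# Made-up payload for illustration only; not real Lagou data.
sample = {'content': {'positionResult': {'result': [
    {'city': '深圳', 'salary': '10k-20k', 'positionName': 'Python',
     'industryField': 'X', 'companySize': '50-150',
     'companyShortName': 'A', 'companyFullName': 'A Ltd',
     'companyLabelList': None},
]}}}

if __name__ == '__main__':
    for row in rows_from_response(sample):
        print(row)
```

Each returned row can be passed straight to `ws.append(row)`, in the same column order as the header row.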
****************************************分割線****************************************