
Accessing web pages with Python

Python version: 3

Fetching a page:

import urllib.request

url="https://blog.csdn.net/qq_33160790"
req=urllib.request.Request(url)
resp=urllib.request.urlopen(req)
data=resp.read().decode('utf-8')

print(data)
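The snippet above assumes the page is UTF-8 and that the request never hangs. As a minimal sketch under stated assumptions, the same fetch could be wrapped in a helper that sets a timeout and reads the charset from the response headers. The function name fetch_text, the 10-second timeout, and the User-Agent value are illustrative choices, not part of the original post.

```python
import urllib.request


def fetch_text(url, timeout=10, default_charset="utf-8"):
    """Fetch a URL and return its body decoded to str.

    Hypothetical helper: the name, the timeout and the User-Agent
    value are assumptions made for this sketch.
    """
    req = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        # Use the charset declared in Content-Type, else fall back to UTF-8
        charset = resp.headers.get_content_charset() or default_charset
        return resp.read().decode(charset, errors="replace")


# Usage (needs network access):
# print(fetch_text("https://blog.csdn.net/qq_33160790"))
```

Decoding with errors="replace" keeps the script running when a page contains a few invalid bytes, at the cost of showing replacement characters in the output.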


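Extracting the article links from the page with requests and lxml:

from lxml import etree
import requests

url = 'https://blog.csdn.net/qq_33160790'
resp = requests.get(url)
if resp.status_code == requests.codes.ok:
    # Parse the HTML and collect the href of every article-title link
    html = etree.HTML(resp.text)
    hrefs = html.xpath('//span[@class="link_title"]/a/@href')
    for href in hrefs:
        print(href)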
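To see what the XPath expression selects without touching the network, here is a rough standard-library sketch that approximates it with html.parser instead of lxml (a swapped-in technique, not what the post uses). It collects the href of every a tag whose innermost enclosing span has class exactly "link_title". The names LinkTitleParser and extract_links, and the sample markup, are made up for illustration.

```python
from html.parser import HTMLParser


class LinkTitleParser(HTMLParser):
    """Approximate //span[@class="link_title"]/a/@href (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.span_stack = []  # one bool per open <span>: is it a link_title span?
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "span":
            self.span_stack.append(attrs.get("class") == "link_title")
        elif tag == "a" and self.span_stack and self.span_stack[-1]:
            # Only keep links whose innermost open <span> is a link_title span
            if attrs.get("href"):
                self.hrefs.append(attrs["href"])

    def handle_endtag(self, tag):
        if tag == "span" and self.span_stack:
            self.span_stack.pop()


def extract_links(html_text):
    """Return the matching href values in document order."""
    parser = LinkTitleParser()
    parser.feed(html_text)
    return parser.hrefs


# Made-up sample markup, only to show which links are selected
sample = (
    '<div><span class="link_title"><a href="/a/1">One</a></span>'
    '<span class="other"><a href="/x">X</a></span><a href="/y">Y</a></div>'
)
print(extract_links(sample))
```

Only the link inside the link_title span is selected; links in other spans or outside any span are ignored, which is the filtering the XPath expression is doing.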


Printing the URLs of all articles:

from lxml import etree
import requests

for i in range(1, 23):   # pages 1..22; the stop value is the number of list pages + 1
    url = 'https://blog.csdn.net/qq_33160790/article/list/' + str(i)
    resp = requests.get(url)
    if resp.status_code == requests.codes.ok:
        html = etree.HTML(resp.text)
        hrefs = html.xpath('//span[@class="link_title"]/a/@href')
        for href in hrefs:
            print(href)
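The loop builds each list-page URL by string concatenation and treats every extracted href as a full URL. As a small sketch, assuming the hrefs might sometimes be relative, the page URLs and the link resolution could be split into two helpers. list_page_urls and absolute_links are hypothetical names introduced here.

```python
from urllib.parse import urljoin

BLOG_URL = "https://blog.csdn.net/qq_33160790"  # same blog URL as in the post


def list_page_urls(blog_url, page_count):
    """Return the article-list page URLs for pages 1..page_count."""
    base = blog_url.rstrip("/")
    return [f"{base}/article/list/{page}" for page in range(1, page_count + 1)]


def absolute_links(page_url, hrefs):
    """Resolve each href against its page; absolute hrefs pass through unchanged."""
    return [urljoin(page_url, href) for href in hrefs]


# Print the first three list-page URLs
for page_url in list_page_urls(BLOG_URL, 3):
    print(page_url)
```

Passing the page count directly (22 in the post's example) avoids having to remember that the stop value of range is one more than the last page.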


Script for generating clicks on CSDN articles:
PS: change the url and the value 23 to match your own blog.

from lxml import etree
import requests
import urllib.request

for i in range(1, 23):   # pages 1..22; the stop value is the number of list pages + 1
    url = 'https://blog.csdn.net/qq_33160790/article/list/' + str(i)
    resp = requests.get(url)
    if resp.status_code == requests.codes.ok:
        html = etree.HTML(resp.text)
        hrefs = html.xpath('//span[@class="link_title"]/a/@href')
        for href in hrefs:
            print(href)
            # Request the article page itself (this is the "click")
            req = urllib.request.Request(href)
            data = urllib.request.urlopen(req).read()
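As written, one failed request stops the whole script, and the requests go out back to back. A hedged sketch of the visiting step with a pause between requests and per-URL error handling might look like this. visit_all, fetch_bytes, the 1-second delay and the 10-second timeout are assumptions for illustration, not something the original script does.

```python
import time
import urllib.request


def fetch_bytes(url, timeout=10):
    """Download a URL and return the raw body (timeout value is an assumption)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read()


def visit_all(urls, fetch=fetch_bytes, delay=1.0):
    """Request each URL once; return (succeeded, failed) lists.

    Hypothetical helper: 'fetch' is injectable so the loop can be
    exercised without network access.
    """
    succeeded, failed = [], []
    for n, url in enumerate(urls):
        if n and delay:
            time.sleep(delay)  # pause between requests to avoid hammering the server
        try:
            fetch(url)
            succeeded.append(url)
        except (OSError, ValueError) as exc:  # URLError and HTTPError subclass OSError
            failed.append((url, exc))
    return succeeded, failed


# Usage (needs network access), with hrefs collected as in the script above:
# succeeded, failed = visit_all(hrefs)
```

Returning the failed URLs together with their exceptions makes it easy to retry just those later instead of rerunning every page.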