Python: manually adding a urlretrieve download method to the requests module!
阿新 • Published: 2019-01-05
The requests module is the successor to the urllib module: for passing headers, cookies, data and so on, requests is definitely the easier one to use, but it has no counterpart to urllib.request.urlretrieve:

urlretrieve(url, filename=None, reporthook=None, params=None)

Pass in a URL and a file path and the file is downloaded. With requests you have to write the download code by hand every single time, which I find too cumbersome, and urlretrieve even has a progress callback, so I wanted to see whether I could port this urlretrieve method over to the requests module.
Key points:
1. How do you find the Python module you want? Type path at the cmd prompt, locate the Python entry, then use Ctrl+F in that folder.
2. Downloading a file boils down to: open the URL with contextlib.closing ---> open the file with with open ---> write the chunks.
3. The reporthook callback works like this: each time a chunk is written to the file, three arguments (the write count, the bytes per write, and the total size taken from the headers) are passed out for the callback to handle.
4. It turns out that packages nowadays define their functions in separate .py files and then import those functions into __init__.py.
5. How to use r.iter_content().
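On point 1: rather than hunting through the PATH output in cmd, Python itself can report where a module lives. A small sketch, using the standard library's contextlib as the example (any pure-Python module works the same way):

```python
import contextlib
import os.path

# A pure-Python module reports its own source location via __file__.
module_path = contextlib.__file__
print(module_path)

# Its parent directory is the folder to open when editing sibling
# files such as urllib/request.py or requests/api.py.
package_dir = os.path.dirname(module_path)
print(os.path.isdir(package_dir))  # True
```

The same trick locates third-party packages: import requests and print requests.__file__ to jump straight to its folder in site-packages.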
Go into the urllib folder and find the urlretrieve method in request.py; it looks like this:
def urlretrieve(url, filename=None, reporthook=None, data=None):
    """
    Retrieve a URL into a temporary location on disk.

    Requires a URL argument. If a filename is passed, it is used as
    the temporary file location. The reporthook argument should be
    a callable that accepts a block number, a read size, and the
    total file size of the URL target. The data argument should be
    valid URL encoded data.

    If a filename is passed and the URL points to a local resource,
    the result is a copy from local file to new file.

    Returns a tuple containing the path to the newly created
    data file as well as the resulting HTTPMessage object.
    """
    url_type, path = splittype(url)  # parses the URL; ignore

    with contextlib.closing(urlopen(url, data)) as fp:  # open the URL
        headers = fp.info()  # response headers

        # Just return the local path and the "headers" for file://
        # URLs. No sense in performing a copy unless requested.
        if url_type == "file" and not filename:
            return os.path.normpath(path), headers  # ignore

        # Handle temporary file setup.
        if filename:
            tfp = open(filename, 'wb')  # open the target file
        else:
            tfp = tempfile.NamedTemporaryFile(delete=False)  # ignore
            filename = tfp.name
            _url_tempfiles.append(filename)

        with tfp:
            result = filename, headers
            bs = 1024*8  # bytes per write
            size = -1
            read = 0
            blocknum = 0  # number of writes; blocknum * bs is roughly the bytes written so far
            if "content-length" in headers:
                size = int(headers["Content-Length"])  # size is the total file size

            if reporthook:
                reporthook(blocknum, bs, size)  # call the hook once before writing

            while True:
                block = fp.read(bs)
                if not block:
                    break
                read += len(block)
                tfp.write(block)  # write
                blocknum += 1
                if reporthook:
                    reporthook(blocknum, bs, size)  # call the hook after every write

    if size >= 0 and read < size:
        raise ContentTooShortError(
            "retrieval incomplete: got only %i out of %i bytes"
            % (read, size), result)

    return result
By contrast, the conventional way to download a file with the requests module looks like this:
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.79 Safari/537.36'}
with closing(requests.get(url=target, stream=True, headers=headers)) as r:
    with open('%d.jpg' % filename, 'ab+') as f:
        for chunk in r.iter_content(chunk_size=1024):
            if chunk:
                f.write(chunk)
                f.flush()
Now let's wrap it up:
def urlretrieve(url, filename=None, reporthook=None, params=None):
    '''Download a file using closing and iter_content.'''
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.79 Safari/537.36'}
    with contextlib.closing(requests.get(url, stream=True, headers=headers, params=params)) as fp:  # open the URL
        header = fp.headers  # response headers
        with open(filename, 'wb+') as tfp:  # open the file ('w' truncates, 'a' appends)
            bs = 1024
            size = -1
            blocknum = 0
            if "content-length" in header:
                size = int(header["Content-Length"])  # nominal total file size
            if reporthook:
                reporthook(blocknum, bs, size)  # call the hook once before writing
            for chunk in fp.iter_content(chunk_size=1024):
                if chunk:
                    tfp.write(chunk)  # write
                    tfp.flush()
                    blocknum += 1
                    if reporthook:
                        reporthook(blocknum, bs, size)  # call the hook after every write
Test:
def Schedule(a, b, c):
    per = 100.0 * a * b / c
    if per > 100:
        per = 100
    sys.stdout.write("  %.2f%%  downloaded: %d  file size: %d" % (per, a * b, c) + '\r')
    sys.stdout.flush()

url = 'https://images.unsplash.com/photo-1503025768915-494859bd53b2?ixlib=rb-0.3.5&q=85&fm=jpg&crop=entropy&cs=srgb&dl=tommy-344440-unsplash.jpg&s=1382cd0338e13f6460ed68182d35cac9'
urlretrieve(url=url, filename='111.jpg', reporthook=Schedule)
OK, it works.
Now let's put this method into the requests module itself.
First, inside the requests folder, append the function we just wrote to the end of api.py.
Also add import contextlib to that file.
Then, in __init__.py, add urlretrieve to the names imported from api.
OK, it can now be called directly:
import requests, os, time, sys

def Schedule(a, b, c):
    per = 100.0 * a * b / c  # a: write count, b: bytes per write, c: total file size
    if per > 100:
        per = 100
    sys.stdout.write("  %.2f%%  downloaded: %d  file size: %d" % (per, a * b, c) + '\r')
    sys.stdout.flush()

url = 'https://images.unsplash.com/photo-1503025768915-494859bd53b2?ixlib=rb-0.3.5&q=85&fm=jpg&crop=entropy&cs=srgb&dl=tommy-344440-unsplash.jpg&s=1382cd0338e13f6460ed68182d35cac9'
requests.urlretrieve(url=url, filename='111.jpg', reporthook=Schedule)
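One caveat: editing files under site-packages is fragile, since the next pip upgrade of requests overwrites api.py and __init__.py. The same effect can be had at runtime by simply assigning the function onto the imported module (requests.urlretrieve = urlretrieve). A minimal sketch of the idea, using a stand-in module object and a placeholder function body so it runs without the real package:

```python
import types

def urlretrieve(url, filename=None, reporthook=None, params=None):
    # Placeholder body; in practice this is the requests-based
    # download function defined above.
    return filename, {}

# A module object is just a namespace, so a function can be attached
# to it at runtime; "requests_stub" here stands in for the real
# requests module.
mod = types.ModuleType("requests_stub")
mod.urlretrieve = urlretrieve

print(mod.urlretrieve("http://example.com/a.jpg", filename="111.jpg"))
# -> ('111.jpg', {})
```

With the real package it is just: import requests; requests.urlretrieve = urlretrieve, done once at program startup.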