Python selenium PIL full-page scrolling screenshot && headless full-page screenshot

阿新 • Published: 2021-10-20

Approach

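First capture the current viewport and use its height as the base height h. Then read the height H of the full page body, scroll down one viewport at a time while capturing a screenshot at each step, and finally stitch the screenshots together with PIL.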
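
The arithmetic behind this approach can be sketched on its own, without a browser. The helper below is illustrative only: the name `plan_captures` and its return shape are assumptions made for this example and are not part of the original code. It returns the scroll offsets of the captures that are used in full, plus the number of pixels by which a final bottom-aligned capture repeats content that was already captured.

```python
def plan_captures(viewport_h, page_h):
    """Plan a scrolling capture (hypothetical helper, for illustration).

    Returns (offsets, tail_overlap):
      offsets      - scroll positions of the captures that are used in full
      tail_overlap - pixels of the final bottom-aligned capture that repeat
                     already captured content, or None if no tail is needed
    """
    if viewport_h <= 0 or page_h <= 0:
        raise ValueError("heights must be positive")
    full = page_h // viewport_h
    if full == 0:
        # The whole page fits into a single viewport.
        return [0], None
    offsets = [i * viewport_h for i in range(full)]
    remainder = page_h % viewport_h
    if remainder == 0:
        # The page is an exact multiple of the viewport height.
        return offsets, None
    # The last capture is aligned to the page bottom, so it overlaps the
    # previous capture by everything except the remaining strip.
    return offsets, viewport_h - remainder


print(plan_captures(800, 2000))
```

For a viewport of 800 and a page of 2000, the captures at offsets 0 and 800 are used in full, and the capture taken at the bottom repeats 400 pixels of the second one. This is the same quantity that the code in the next section passes as the overlap of the final image.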

Code

# -*- coding:utf-8 -*-
# author: [email protected]
# software: PyCharm
import os

from PIL import Image
from time import sleep


class ScreenShot:
    __JS__ = {
        'scroll_to_bottom': "window.scroll({top:document.body.clientHeight,left:0,behavior:'auto'});",
        'scroll_to_y': "window.scroll({top:%d,left:0,behavior:'auto'});",
    }
    __base_end__ = 'tmp_end.png'
    __scroll_bottom__ = 'scroll_to_bottom'
    __scroll_y__ = 'scroll_to_y'
    __body__ = '//body'
    __height__ = 'height'
    __clear_shell__ = 'rm -rf *.png'
    __RGB__ = 'RGB'

    @classmethod
    def screen_shot(cls, driver, title, uploader_url='', delete=False):
        """
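        Full-page scrolling screenshot.
        :param driver: webdriver instance
        :param title: title (used as the file name of the final image)
        :param uploader_url: upload URL
        :param delete: whether to delete all generated images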
        :return:
        """
        base_image = '{}.png'.format(title)
        driver.save_screenshot(base_image)
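        # find_element_by_xpath was removed in Selenium 4.3;
        # find_element('xpath', ...) works in both Selenium 3 and 4
        body_h = int(driver.find_element('xpath', cls.__body__).size.get(cls.__height__))
        # Dividing by 2 appears to assume a device pixel ratio of 2 (e.g. a Retina
        # display), so that current_h is in CSS pixels like body_h
        current_h = Image.open(base_image).size[1] / 2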
        for i in range(1, int(body_h / current_h)):
            driver.execute_script(cls.__JS__[cls.__scroll_y__] % (current_h * i))
            sleep(.5)
            driver.save_screenshot(f'tmp_{i}.png')
            cls.__join_images__(base_image, f'tmp_{i}.png', 0, base_image)
        driver.execute_script(cls.__JS__[cls.__scroll_bottom__])
        driver.save_screenshot(cls.__base_end__)
        cls.__join_images__(base_image, cls.__base_end__, int(current_h - int(body_h % current_h)), base_image)
        # TODO: upload the image
        url = ''
        # Remove the images
        if delete:
            os.system(cls.__clear_shell__)
        return url

    @classmethod
    def __join_images__(cls, png1, png2, size=0, output='result.png'):
        """
        Stitch two images vertically.
        :param png1: first image
        :param png2: second image
        :param size: height of the overlap between the two images
        :param output: output image file
        :return:
        """
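        # Convert the overlap from CSS pixels to image pixels
        # (appears to assume a device pixel ratio of 2)
        size = size * 2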
        img1, img2 = Image.open(png1), Image.open(png2)
        size1, size2 = img1.size, img2.size
        joint = Image.new(cls.__RGB__, (size1[0], size1[1] + size2[1] - size))
        loc1, loc2 = (0, 0), (0, size1[1] - size)
        joint.paste(img1, loc1)
        joint.paste(img2, loc2)
        joint.save(output)


if __name__ == '__main__':
    from selenium import webdriver
    driver = webdriver.Chrome()
    driver.get("https://www.cnblogs.com/worldline/")
    ScreenShot.screen_shot(driver, 'worldline')
    driver.quit()
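
The factors `/ 2` and `* 2` in the class above only match displays whose device pixel ratio is 2. As a minimal sketch of the same stitching geometry with the ratio made explicit, the helper below computes the canvas size and the paste position of the second image. The name `joined_layout` and the `pixel_ratio` parameter are assumptions for this example and do not appear in the original code; in a browser the ratio could be read from `window.devicePixelRatio`.

```python
def joined_layout(size1, size2, overlap_css=0, pixel_ratio=2):
    """Geometry of a vertical stitch (hypothetical helper, for illustration).

    size1, size2 - (width, height) of the two images, in image pixels
    overlap_css  - overlap between the images, in CSS pixels
    pixel_ratio  - image pixels per CSS pixel (assumed to be 2 by default,
                   matching the hard-coded factor in the class above)
    Returns (canvas_size, paste_position_of_second_image).
    """
    w1, h1 = size1
    _, h2 = size2
    overlap_px = int(round(overlap_css * pixel_ratio))
    if not 0 <= overlap_px <= min(h1, h2):
        raise ValueError("overlap must fit inside both images")
    # The canvas keeps the width of the first image; the overlap is counted once.
    canvas = (w1, h1 + h2 - overlap_px)
    # The second image starts where the overlap begins in the first image.
    return canvas, (0, h1 - overlap_px)


print(joined_layout((1600, 1600), (1600, 1600), overlap_css=400))
```

With two 1600x1600 screenshots and a 400 CSS-pixel overlap at ratio 2, the canvas is 1600x2400 and the second image is pasted at y = 800. Passing `pixel_ratio=1` gives the layout for a standard display.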

Other

In headless mode, the following can be used instead:


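import time

from selenium import webdriver
from selenium.webdriver.chrome.options import Options


def get_image(url, pic_name):
    """
    Full-page screenshot for headless mode.
    :param url: URL to visit
    :param pic_name: image file name
    :return:
    """
    chrome_options = Options()
    chrome_options.add_argument('--headless')
    driver = webdriver.Chrome(options=chrome_options)
    driver.get(url)
    time.sleep(.5)
    # Resize the window to the full document size so that a single
    # screenshot covers the whole page
    width = driver.execute_script("return document.documentElement.scrollWidth")
    height = driver.execute_script("return document.documentElement.scrollHeight")
    print(width, height)
    driver.set_window_size(width, height)
    time.sleep(.5)
    driver.save_screenshot(pic_name)
    # quit() also shuts down the chromedriver process; close() only closes the window
    driver.quit()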
