Jandan (煎蛋網) crawler: reverse-engineering the JS to resolve img paths

阿新 • Published: 2019-01-15


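Images on the page are loaded through a JS onload event: the src attribute points at a blank placeholder, and an encoded string follows the tag.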

<img src="//img.jandan.net/img/blank.gif" onload="jandan_load_img(this)" />Ly93eDEuc2luYWltZy5jbi9tdzYwMC8wMDd1ejNLN2x5MWZ6NmVub3ExdHhqMzB1MDB1MGFkMC5qcGc=

In the browser's Sources panel, find the corresponding JS function, jandan_load_img.


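Stepping through the JS in the debugger shows that the string Ly93eDEuc2luYWltZy5jbi9tdzYwMC8wMDd1ejNLN2x5MWZ6NmVub3ExdHhqMzB1MDB1MGFkMC5qcGc= is passed to the function jdugRtgCtw78dflFjGXBvN6TBHAoKvZ7xu, which runs it through base64_decode to obtain the img path.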
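The decoding step can be reproduced outside the browser. The sketch below assumes, as the debugger session suggests, that the string is plain Base64 of a protocol-relative URL. decode_img_hash is a hypothetical helper name chosen for this example, not something taken from the site's JS.

```python
import base64

# The sample string from the page markup shown above.
IMG_HASH = "Ly93eDEuc2luYWltZy5jbi9tdzYwMC8wMDd1ejNLN2x5MWZ6NmVub3ExdHhqMzB1MDB1MGFkMC5qcGc="


def decode_img_hash(img_hash):
    """Base64-decode the hash and prepend a scheme to the protocol-relative path."""
    path = base64.b64decode(img_hash).decode("utf-8")
    return "http:" + path


print(decode_img_hash(IMG_HASH))
```

For the sample string this gives a path on wx1.sinaimg.cn under the mw600 size directory.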

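Then use a regular expression to replace the size segment of the img path (mw\d+, for example mw600) with large, which points at the full-size image.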
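A minimal sketch of that substitution, applied to the decoded form of the sample string from earlier in this post. to_large is a hypothetical helper name; the pattern is the same mw\d+ that the crawler code uses.

```python
import re

# Decoded form of the sample string shown earlier in this post.
DECODED = "http://wx1.sinaimg.cn/mw600/007uz3K7ly1fz6enoq1txj30u00u0ad0.jpg"


def to_large(img_path):
    """Replace the size segment (mw followed by digits) with large."""
    return re.sub(r"mw\d+", "large", img_path)


large_url = to_large(DECODED)
print(large_url)
# The last path component serves as the local file name.
print(large_url.rsplit("/", 1)[1])
```

The pattern is not anchored to the slashes, so it would also rewrite an mw-plus-digits run inside a file name. If that ever matters, a stricter pattern such as r"/mw\d+/" replaced with "/large/" is one option.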

The crawler code is as follows:

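import base64
import os
import re
from concurrent.futures import ThreadPoolExecutor
from random import choice

import requests
from lxml import etree

# Local module that provides a list of User-Agent strings.
from user_agent_list import USER_AGENTS

headers = {'user-agent': choice(USER_AGENTS)}


def fetch_url(url):
    '''
    :param url: page URL
    :return: html text, or None if the request fails
    '''
    try:
        r = requests.get(url, headers=headers)
        r.raise_for_status()
        r.encoding = r.apparent_encoding
        if r.status_code in [200, 201]:
            return r.text
    except Exception as e:
        print(e)
    return None


def downloadone(url):
    html = fetch_url(url)
    if html is None:
        # The page could not be fetched, so there is nothing to parse.
        return
    data = etree.HTML(html)
    img_hash_list = data.xpath('//*[@class="img-hash"]/text()')
    for img_hash in img_hash_list:
        # The hash is the Base64-encoded, protocol-relative image URL.
        img_path = 'http:' + bytes.decode(base64.b64decode(img_hash))
        # Swap the size segment (e.g. mw600) for large to get the full-size image.
        img_path = re.sub(r'mw\d+', 'large', img_path)
        img_name = img_path.rsplit('/', 1)[1]
        # Fetch first, so a failed request does not leave an empty file behind.
        r = requests.get(img_path)
        with open(os.path.join('jiandan', img_name), 'wb') as f:
            f.write(r.content)


def main():
    # Make sure the output directory exists before the workers write to it.
    os.makedirs('jiandan', exist_ok=True)
    url_list = []
    # Pages 1 to 43.
    for page in range(1, 44):
        url = 'http://jandan.net/ooxx/page-{}'.format(page)
        url_list.append(url)
    with ThreadPoolExecutor(4) as executor:
        executor.map(downloadone, url_list)


if __name__ == '__main__':
    main()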
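One thing to be aware of with executor.map as used above: an exception raised inside a worker only surfaces when the results are iterated, so a failing page can pass unnoticed. The sketch below shows one possible way to collect failures with submit and as_completed. fake_download and run_all are hypothetical stand-ins written for this example (no network access), not part of the original crawler.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def fake_download(url):
    """Stand-in for downloadone; fails on one page to show the error handling."""
    if url.endswith("page-2"):
        raise RuntimeError("simulated failure for " + url)
    return url


def run_all(urls, workers=4):
    """Run the worker over all URLs and return (succeeded, failed) lists."""
    succeeded, failed = [], []
    with ThreadPoolExecutor(workers) as executor:
        # Map each future back to its URL so failures can be reported by page.
        futures = {executor.submit(fake_download, u): u for u in urls}
        for future in as_completed(futures):
            url = futures[future]
            try:
                succeeded.append(future.result())
            except Exception as exc:
                failed.append((url, str(exc)))
    return sorted(succeeded), failed


urls = ["http://jandan.net/ooxx/page-{}".format(n) for n in range(1, 4)]
ok, bad = run_all(urls)
print(ok)
print(bad)
```

In the real crawler, downloadone would take the place of fake_download, and the failed list could be retried or logged.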
