Python 3 Web Scraping 04 (More Examples: Processing Fetched Page Content)

阿新 • Published: 2018-12-16


#!/usr/bin/env python
# -*- coding:utf-8 -*-

import os
import re
import requests
from bs4 import NavigableString
from bs4 import BeautifulSoup

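# Fetch the Qiushibaike home page and print the text of each joke
res=requests.get("https://www.qiushibaike.com/")
qiushi=res.content
soup=BeautifulSoup(qiushi,"html.parser")
# Each joke sits in an element with class "content"
duanzis=soup.find_all(class_="content")
for i in duanzis:
    # Take the first child node of the <span>; i.span.string is an alternative
    duanzi=i.span.contents[0]
    # duanzi=i.span.string
    print(duanzi)
    # print(i.span.string)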
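
The loop above switches between `i.span.contents[0]` and `i.span.string`. The two only agree when the `<span>` holds a single text node. The sketch below shows the difference on a small hand-written HTML snippet; the snippet is an assumption for illustration, not the real markup of the site.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the second <span> contains a <br/>, so it has three children
html = """
<div class="content"><span>first joke</span></div>
<div class="content"><span>second<br/>joke</span></div>
"""
soup = BeautifulSoup(html, "html.parser")
spans = [div.span for div in soup.find_all(class_="content")]

# .contents[0] is always the first child node, whatever follows it
first_children = [span.contents[0] for span in spans]
# .string is None as soon as the tag has more than one child
strings = [span.string for span in spans]
# .get_text() joins every text node inside the tag
texts = [span.get_text() for span in spans]

print(first_children)
print(strings)
print(texts)
```

If a joke contains line breaks, `get_text()` is probably the more robust choice, since `.string` returns `None` and `.contents[0]` only yields the first line.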


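# Download every lazily loaded image from a 699pic search results page
res=requests.get("http://699pic.com/sousuo-218808-13-1-0-0-0.html")
image=res.content
soup=BeautifulSoup(image,"html.parser")
images=soup.find_all(class_="lazy")

for i in images:
    original=i["data-original"]  # image URL
    title=i["title"]             # used as the file name
    # print(title)
    # print(original)
    # print("")
    try:
        # Save to ./jpg/<title>.jpg; the jpg directory must already exist
        with open(os.path.join(os.getcwd(),"jpg",title+'.jpg'),'wb') as file:
            file.write(requests.get(original).content)
    except (OSError,requests.RequestException):
        # Skip images that cannot be downloaded or written
        pass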
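
The download loop above uses the image title directly as a file name and silently skips anything that fails. A title taken from a web page may contain characters such as `/` or `?` that are not valid in file names. Below is a minimal sketch of one way to clean the title first; `safe_filename` and `image_path` are hypothetical helpers, not part of the original script.

```python
import os
import re


def safe_filename(title, default="untitled"):
    """Hypothetical helper: replace characters that are risky in file names."""
    # Collapse path separators, reserved punctuation, and whitespace into "_"
    cleaned = re.sub(r'[\\/:*?"<>|\s]+', "_", title).strip("_. ")
    return cleaned or default


def image_path(title, directory="jpg"):
    """Build a portable path such as jpg/<title>.jpg (does not create the directory)."""
    return os.path.join(directory, safe_filename(title) + ".jpg")


print(image_path("sun set? / sea"))
```

Calling `os.makedirs("jpg", exist_ok=True)` once before the loop would also remove the need for the directory to exist in advance.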

r = requests.get("http://699pic.com/sousuo-218808-13-1.html")
fengjing = r.content
soup = BeautifulSoup(fengjing, "html.parser")
# Find all tags with class "lazy"
images = soup.find_all(class_="lazy")
# print(images)  # find_all returns a list-like ResultSet

for i in images:
    jpg_rl = i["data-original"]  # get the image URL
    title = i["title"]           # get the title
    print(title)
    print(jpg_rl)
    print("")

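# r = requests.get("https://www.qiushibaike.com/")  # alternative page; its result was overwritten by the next request
r=requests.get("http://www.cnblogs.com/nicetime/")
blog=r.content
# Either parser can be used; "lxml" requires the lxml package to be installed
# soup=BeautifulSoup(blog,"html.parser")
soup=BeautifulSoup(blog,features="lxml")
print(soup.contents[0].contents)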


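# Three ways to locate a tag; each assignment overwrites the previous one
tag=soup.find('div')                            # by tag name
tag=soup.find(class_="menu-bar menu clearfix")  # by the exact class attribute string
tag=soup.find(id="menu")                        # by id
print(list(tag))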

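tag01=soup.find(class_="c_b_p_desc")

# .contents is a list of direct children, .children iterates over the same nodes,
# and .descendants also walks every nested node
print(len(list(tag01.contents)))
print(len(list(tag01.children)))
print(len(list(tag01.descendants)))

print(tag01.contents)
print(tag01.children)  # prints an iterator object, not the nodes themselves
for i in tag01.children:
    print(i)


print(len(tag01.contents))

# Iterating over a tag directly is the same as iterating over .children
for i in tag01:
    print(i)

print(tag01.contents[0].string)
print(tag01.contents[1])
print(tag01.contents[1].string)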
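
The counts printed above depend on the live page, so the difference between `.contents`, `.children`, and `.descendants` is easier to see on a fixed snippet. The following sketch reuses the class name `c_b_p_desc` from the script, but the HTML itself is made up for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: one text node and one <a>, which itself contains nested nodes
html = '<div class="c_b_p_desc">summary <a href="/p/1">read <b>more</b></a></div>'
soup = BeautifulSoup(html, "html.parser")
tag = soup.find(class_="c_b_p_desc")

direct = tag.contents           # list of direct children
lazy = list(tag.children)       # the same nodes, produced by an iterator
nested = list(tag.descendants)  # every node at any depth, in document order

print(len(direct), len(lazy), len(nested))
print(tag.contents[0].string)   # the text node itself
print(tag.contents[1].string)   # None, because <a> has more than one child
```

Here the `<div>` has two direct children but five descendants: the text `summary `, the `<a>` tag, the text `read `, the `<b>` tag, and the text `more`.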


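url = "http://www.dygod.net/html/tv/oumeitv/109673.html"
s = requests.get(url)
# This assumes the page is GBK-encoded but s.text was decoded as ISO-8859-1,
# so re-encode to recover the original bytes and decode them as GBK
print(s.text.encode("iso-8859-1").decode('gbk'))
# Capture the href of every link whose text starts with "ftp"
res = re.findall('href="(.*?)">ftp',s.text)
for resi in res:
    a=resi.encode("iso-8859-1").decode('gbk')
    print(a)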
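
The `encode("iso-8859-1").decode('gbk')` round trip above works because ISO-8859-1 maps every byte to exactly one character, so encoding the garbled text back recovers the original bytes. The sketch below reproduces the effect offline with a sample string; the string is an assumption for illustration and is not taken from the page.

```python
# Bytes as a GBK-encoded server would send them (sample text, not from the real page)
raw = "电影下载".encode("gbk")

# What the text looks like when those bytes are wrongly decoded as ISO-8859-1
garbled = raw.decode("iso-8859-1")

# The repair used in the script: back to the original bytes, then decode as GBK
repaired = garbled.encode("iso-8859-1").decode("gbk")

# Decoding the raw bytes with the right codec gives the same result directly
direct = raw.decode("gbk")

print(repaired)
```

With requests, a likely simpler alternative is to set `s.encoding = "gbk"` before reading `s.text`, or to decode `s.content` yourself, which avoids repairing each matched string separately.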
