Web Scraping -- A Simple BeautifulSoup Example

阿新 • Published: 2018-11-11

1. Example: scraping article titles from the Jianshu home page

# coding:utf-8
import requests
from bs4 import BeautifulSoup


# Scrape article titles from the Jianshu home page
class SoupSpider:
    def __init__(self):
        # Reuse one session so connections and cookies persist across requests
        self.session = requests.Session()

    def jian_shu_spider(self, url, headers):
        # Fetch the page through the session created in __init__
        response = self.session.get(url, headers=headers, timeout=10).text
        # Parse the returned HTML into a BeautifulSoup object
        soup = BeautifulSoup(response, "lxml")
        # Find every element whose class attribute includes "title"
        title_list = soup.find_all(class_="title")
        for tit in title_list:
            title = tit.text
            print("Article title: {}".format(title))


if __name__ == '__main__':
    soup_spider = SoupSpider()
    soup_spider.jian_shu_spider(
        "http://www.jianshu.com",
        {
            "Referer": "https://www.jianshu.com/",
            "User-Agent": "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/68.0.3440.106 Safari/537.36"
        }
    )
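
To see what find_all(class_="title") selects without touching the network, the same lookup can be tried on a small inline HTML string. This is only a sketch: the sample markup, the extract_titles helper, and the choice of the built-in "html.parser" backend (instead of "lxml") are assumptions made for illustration and say nothing about how the real Jianshu page is structured.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, loosely imitating a list of article links.
SAMPLE_HTML = """
<ul class="note-list">
  <li><a class="title" href="/p/1">First post</a></li>
  <li><a class="title" href="/p/2">  Second post </a></li>
  <li><a class="abstract" href="/p/3">Not a title</a></li>
</ul>
"""


def extract_titles(html, parser="html.parser"):
    """Return the stripped text of every element whose class includes "title"."""
    soup = BeautifulSoup(html, parser)
    return [tag.get_text(strip=True) for tag in soup.find_all(class_="title")]


if __name__ == "__main__":
    for title in extract_titles(SAMPLE_HTML):
        print("Article title: {}".format(title))
```

Keeping the parsing step in its own function, separate from the HTTP request, makes it possible to check the selector against saved HTML before pointing the spider at a live site.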

2. Scraping results
