(1). Introduction to Scrapy

阿新 • Published: 2018-07-02


scrapy startproject xxx

cd xxx

scrapy genspider xxxx xxxx.com

# -*- coding: utf-8 -*-
import scrapy


class ShiinaSpider(scrapy.Spider):
    name = 'shiina'
    # allowed_domains should cover the site in start_urls; the generated
    # value 'mashiro.com' did not match tieba.baidu.com.
    allowed_domains = ['tieba.baidu.com']
    start_urls = ['https://tieba.baidu.com/p/5290405550?red_tag=0653675634']

    def parse(self, response):
        # response: the response object returned by the downloader
        # Run with: scrapy crawl shiina --nolog
        # (--nolog means the log is not printed)
        print(response)
        print(response.url)
        print(response.text)  # output not shown here

# Program output:
#   <200 https://tieba.baidu.com/p/5290405550?red_tag=0653675634>
#   https://tieba.baidu.com/p/5290405550?red_tag=0653675634


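Every spider you create has a start URL. When the crawl runs, the Scrapy engine puts that link into the scheduler, then takes links back out of it and hands them to the downloader. Once a page is downloaded, it is passed to the spider. The spider parses the content and can then either hand the extracted data to a pipeline for persistence, or pass new URLs back through the Scrapy engine to the scheduler, so that crawling continues recursively.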

You can think of the Scrapy engine as a while loop and the scheduler as a queue: the engine keeps taking URLs from the queue and handing them to the downloader.
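The while-loop picture can be sketched in plain Python. This is only a toy model of the analogy, not how Scrapy is implemented (the real engine is asynchronous). The names FAKE_WEB, download, parse and crawl, and the example.com URLs, are all made up for illustration.

```python
from collections import deque

# Hypothetical in-memory "web": URL -> (page text, outgoing links).
FAKE_WEB = {
    "http://example.com/a": ("page A", ["http://example.com/b", "http://example.com/c"]),
    "http://example.com/b": ("page B", ["http://example.com/c"]),
    "http://example.com/c": ("page C", ["http://example.com/a"]),
}


def download(url):
    """Stand-in for the downloader: return the page text and links for a URL."""
    return FAKE_WEB[url]


def parse(url, body, links):
    """Stand-in for a spider callback: yield one item, then the URLs to follow."""
    yield {"url": url, "text": body}
    for link in links:
        yield link


def crawl(start_urls):
    scheduler = deque(start_urls)  # the scheduler modelled as a FIFO queue
    seen = set(start_urls)         # do not schedule the same URL twice
    stored = []                    # stand-in for a pipeline persisting items
    while scheduler:               # the engine modelled as a while loop
        url = scheduler.popleft()
        body, links = download(url)
        for result in parse(url, body, links):
            if isinstance(result, dict):
                stored.append(result)     # data goes to the "pipeline"
            elif result not in seen:
                seen.add(result)
                scheduler.append(result)  # new URL goes back to the queue
    return stored


items = crawl(["http://example.com/a"])
print([item["url"] for item in items])
```

The seen set is what stops the loop from running forever when pages link back to each other, as page C does here.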
