Scrapy Custom Extensions
阿新 • Published: 2018-11-10
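1. Create an extension file and define a class in it. The class must provide a from_crawler class method, which Scrapy calls to build the extension instance: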
from scrapy import signals


class MyExtend:
    def __init__(self, crawler):
        self.crawler = crawler
        # Attach a handler to the signal (the "hook")
        crawler.signals.connect(self.start, signals.engine_started)

    @classmethod
    def from_crawler(cls, crawler):
        return cls(crawler)

    def start(self):
        # Custom logic goes here
        print('signals.engine_started')
2. Register the extension in settings.py
EXTENSIONS = {
    'day96.extensions.MyExtend': 300,
}
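As a rough sketch of how this setting behaves: each key is the import path of an extension class and each value is its load order. Setting the value to None disables an extension. The example below reuses the day96.extensions.MyExtend path from above and, purely for illustration, switches off Scrapy's built-in telnet console. Whether you want that in a real project is an assumption here, not a recommendation.

```python
# settings.py (illustrative fragment)
EXTENSIONS = {
    # Custom extension from step 1; 300 is its load order
    'day96.extensions.MyExtend': 300,
    # A value of None disables an extension, here a built-in one
    'scrapy.extensions.telnet.TelnetConsole': None,
}
```

Extensions rarely depend on one another, so the exact order value usually matters less than it does for middlewares.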
3. Signals you can hook into (defined in scrapy/signals.py)
# Sent when the engine starts running
engine_started = object()
# Sent when the engine stops running
engine_stopped = object()
spider_opened = object()
spider_idle = object()
spider_closed = object()
spider_error = object()
request_scheduled = object()
request_dropped = object()
response_received = object()
response_downloaded = object()
# Sent when an Item is yielded and has passed through the pipelines
item_scraped = object()
# Sent when an Item is dropped
item_dropped = object()