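Scrapy Notes: A Twisted Syntax Error
The first time I used Scrapy I ran into a syntax error inside Twisted, which really should not happen on a fresh install. Below is a record of what happened and how I fixed it.
Environment: Windows 10 + Python 3.7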
1. Create the Scrapy project
D:\PythonWorkerspace>scrapy startproject xdb
New Scrapy project 'xdb', using template directory 'c:\\python3.7\\lib\\site-packages\\scrapy\\templates\\project', created in:
    D:\PythonWorkerspace\xdb

You can start your first spider with:
    cd xdb
    scrapy genspider example example.com
2. Create the spiders
D:\PythonWorkerspace>cd xdb

D:\PythonWorkerspace\xdb>dir
 Volume in drive D is New Volume
 Volume Serial Number is 90B7-EFCF

 Directory of D:\PythonWorkerspace\xdb

2018/10/13  06:29    <DIR>          .
2018/10/13  06:29    <DIR>          ..
2018/10/13  06:29               249 scrapy.cfg
2018/10/13  06:29    <DIR>          xdb
               1 File(s)            249 bytes
               3 Dir(s)  16,750,272,512 bytes free

D:\PythonWorkerspace\xdb>scrapy genspider chouti dig.chouti.com
Created spider 'chouti' using template 'basic' in module:
  xdb.spiders.chouti

D:\PythonWorkerspace\xdb>scrapy genspider cnblogs cnblogs.com
Created spider 'cnblogs' using template 'basic' in module:
  xdb.spiders.cnblogs
3. Write the spider code
Since this was my first time writing a crawler, I put only a single line in the parse method:
print('aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa')
4. Start the spider
From the project root, run scrapy crawl chouti to start the spider:
D:\PythonWorkerspace\xdb>scrapy crawl chouti
2018-10-13 07:13:46 [scrapy.utils.log] INFO: Scrapy 1.5.1 started (bot: xdb)
2018-10-13 07:13:46 [scrapy.utils.log] INFO: Versions: lxml 4.2.5.0, libxml2 2.9.7, cssselect 1.0.3, parsel 1.5.0, w3lib 1.19.0, Twisted 18.7.0, Python 3.7.0 (v3.7.0:1bf9cc5093, Jun 27 2018, 04:06:47) [MSC v.1914 32 bit (Intel)], pyOpenSSL 18.0.0 (OpenSSL 1.1.0i 14 Aug 2018), cryptography 2.3.1, Platform Windows-10-10.0.17134-SP0
2018-10-13 07:13:46 [scrapy.crawler] INFO: Overridden settings: {'BOT_NAME': 'xdb', 'NEWSPIDER_MODULE': 'xdb.spiders', 'ROBOTSTXT_OBEY': True, 'SPIDER_MODULES': ['xdb.spiders']}
Traceback (most recent call last):
  File "c:\python3.7\lib\runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "c:\python3.7\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "C:\Python3.7\Scripts\scrapy.exe\__main__.py", line 9, in <module>
  File "c:\python3.7\lib\site-packages\scrapy\cmdline.py", line 150, in execute
    _run_print_help(parser, _run_command, cmd, args, opts)
  File "c:\python3.7\lib\site-packages\scrapy\cmdline.py", line 90, in _run_print_help
    func(*a, **kw)
  File "c:\python3.7\lib\site-packages\scrapy\cmdline.py", line 157, in _run_command
    cmd.run(args, opts)
  File "c:\python3.7\lib\site-packages\scrapy\commands\crawl.py", line 57, in run
    self.crawler_process.crawl(spname, **opts.spargs)
  File "c:\python3.7\lib\site-packages\scrapy\crawler.py", line 170, in crawl
    crawler = self.create_crawler(crawler_or_spidercls)
  File "c:\python3.7\lib\site-packages\scrapy\crawler.py", line 198, in create_crawler
    return self._create_crawler(crawler_or_spidercls)
  File "c:\python3.7\lib\site-packages\scrapy\crawler.py", line 203, in _create_crawler
    return Crawler(spidercls, self.settings)
  File "c:\python3.7\lib\site-packages\scrapy\crawler.py", line 55, in __init__
    self.extensions = ExtensionManager.from_crawler(self)
  File "c:\python3.7\lib\site-packages\scrapy\middleware.py", line 58, in from_crawler
    return cls.from_settings(crawler.settings, crawler)
  File "c:\python3.7\lib\site-packages\scrapy\middleware.py", line 34, in from_settings
    mwcls = load_object(clspath)
  File "c:\python3.7\lib\site-packages\scrapy\utils\misc.py", line 44, in load_object
    mod = import_module(module)
  File "c:\python3.7\lib\importlib\__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
  File "<frozen importlib._bootstrap>", line 983, in _find_and_load
  File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 677, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 728, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "c:\python3.7\lib\site-packages\scrapy\extensions\telnet.py", line 12, in <module>
    from twisted.conch import manhole, telnet
  File "c:\python3.7\lib\site-packages\twisted\conch\manhole.py", line 154
    def write(self, data, async=False):
                              ^
SyntaxError: invalid syntax
D:\PythonWorkerspace\xdb>
Surprisingly, the command failed with an exception: SyntaxError: invalid syntax.
I opened c:\python3.7\lib\site-packages\twisted\conch\manhole.py in PyCharm (Ctrl + left-click on the path).
At line 154 PyCharm reports: formal parameter name expected. Looking further, I found that self.handler.addOutput() uses the same parameter name.
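Because the name appears in more than one place, it helps to list every line that mentions it before editing anything. The following is only a minimal sketch: find_async_lines is a hypothetical helper name, and SAMPLE is an illustrative excerpt modeled on the signature in the traceback (its body line is an assumption, not a copy of the Twisted source).

```python
import re

# Matches "async" only as a whole word, so names such as
# "asynchronous_helper" or "_async" are not reported.
ASYNC_WORD = re.compile(r"\basync\b")


def find_async_lines(source):
    """Return (line_number, stripped_line) pairs for lines that mention `async`."""
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        if ASYNC_WORD.search(line):
            hits.append((number, line.strip()))
    return hits


# Illustrative excerpt only (assumption): the first line is the signature from
# the traceback, the second shows the kind of call that reuses the name.
SAMPLE = """\
def write(self, data, async=False):
    self.handler.addOutput(data, async)

def asynchronous_helper():
    pass
"""

if __name__ == "__main__":
    for number, line in find_async_lines(SAMPLE):
        print(number, line)
```

To check the real file, read manhole.py as text and pass its contents to find_async_lines instead of SAMPLE.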
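The cause is that async is a reserved keyword in Python (it became one in Python 3.7), and using a keyword as a formal parameter name, which is ultimately just a variable name, is illegal. Once the cause is clear, the fix is easy: rename the parameter (I changed it to _ansyc), update every place that uses it, and save the file.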
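The keyword status of async can be confirmed from the interpreter itself. Below is a small sketch using only the standard library; is_valid_parameter_name and compiles are hypothetical helper names, and BROKEN and FIXED are one-line stand-ins for the method signature before and after the rename.

```python
import keyword


def is_valid_parameter_name(name):
    """True if `name` can be used as a parameter name on this interpreter."""
    return name.isidentifier() and not keyword.iskeyword(name)


def compiles(source):
    """True if `source` parses without a SyntaxError on this interpreter."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True


# Stand-ins for the signature before and after the rename (bodies are placeholders).
BROKEN = "def write(self, data, async=False):\n    pass\n"
FIXED = "def write(self, data, _ansyc=False):\n    pass\n"

if __name__ == "__main__":
    print(is_valid_parameter_name("async"))
    print(compiles(BROKEN))
    print(compiles(FIXED))
```

On Python 3.7 or later this prints False, False, True. On older interpreters, where async was not yet reserved, the first two results differ.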
5. Verify the fix
Restart the spider; it now runs normally.
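Besides rerunning the spider, the edited file can be checked directly. The sketch below parses a source file without executing it and reports the first syntax error, if any. The helper names syntax_error_in and locate_module_file are hypothetical, and the script assumes Twisted is installed in the same environment as the interpreter running it.

```python
import ast
import importlib.util


def syntax_error_in(path):
    """Return None if the file parses, otherwise a short description of the error."""
    with open(path, "rb") as handle:
        source = handle.read()
    try:
        ast.parse(source, filename=path)
    except SyntaxError as exc:
        return "line {}: {}".format(exc.lineno, exc.msg)
    return None


def locate_module_file(module_name):
    """Return the source path of a module, or None if it cannot be found.

    Parent packages are imported during the lookup; the module itself is not executed.
    """
    try:
        spec = importlib.util.find_spec(module_name)
    except (ImportError, ValueError):
        return None
    return spec.origin if spec is not None else None


if __name__ == "__main__":
    path = locate_module_file("twisted.conch.manhole")
    if path is None:
        print("twisted.conch.manhole not found in this environment")
    else:
        print(path, syntax_error_in(path) or "parses cleanly")
```

If the rename missed an occurrence, the reported line number points at the next place to fix.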