
Scrapy usage notes: a Twisted syntax error

The very first time I used Scrapy I hit a syntax error inside Twisted, which by rights should not happen. Below is a detailed record of the process and the fix.

Environment: Windows 10 + Python 3.7

1. Create the Scrapy project

D:\PythonWorkerspace>scrapy startproject xdb
New Scrapy project 'xdb', using template directory 'c:\\python3.7\\lib\\site-packages\\scrapy\\templates\\project', created in:
    D:\PythonWorkerspace\xdb

You can start your first spider with:
    cd xdb
    scrapy genspider example example.com

2. Create the spiders

D:\PythonWorkerspace>cd xdb

D:\PythonWorkerspace\xdb>dir
 Volume in drive D is 新加捲
 Volume Serial Number is 90B7-EFCF

 Directory of D:\PythonWorkerspace\xdb

2018/10/13  06:29    <DIR>          .
2018/10/13  06:29    <DIR>          ..
2018/10/13  06:29               249 scrapy.cfg
2018/10/13  06:29    <DIR>          xdb
               1 File(s)            249 bytes
               3 Dir(s)  16,750,272,512 bytes free

D:\PythonWorkerspace\xdb>scrapy genspider chouti dig.chouti.com
Created spider 'chouti' using template 'basic' in module:
  xdb.spiders.chouti

D:\PythonWorkerspace\xdb>scrapy genspider cnblogs cnblogs.com
Created spider 'cnblogs' using template 'basic' in module:
  xdb.spiders.cnblogs

3. Write the spider code

Since this was my first time writing a crawler, the parse method here contains only a single line of code:

print('aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa')

4. Start the spider

From the project root, run scrapy crawl chouti to start this spider:

D:\PythonWorkerspace\xdb>scrapy crawl chouti
2018-10-13 07:13:46 [scrapy.utils.log] INFO: Scrapy 1.5.1 started (bot: xdb)
2018-10-13 07:13:46 [scrapy.utils.log] INFO: Versions: lxml 4.2.5.0, libxml2 2.9.7, cssselect 1.0.3, parsel 1.5.0, w3lib 1.19.0, Twisted 18.7.0, Python 3.7.0 (v3.7.0:1bf9cc5093, Jun 27 2018, 04:06:47) [MSC v.1914 32 bit (Intel)], pyOpenSSL 18.0.0 (OpenSSL 1.1.0i  14 Aug 2018), cryptography 2.3.1, Platform Windows-10-10.0.17134-SP0
2018-10-13 07:13:46 [scrapy.crawler] INFO: Overridden settings: {'BOT_NAME': 'xdb', 'NEWSPIDER_MODULE': 'xdb.spiders', 'ROBOTSTXT_OBEY': True, 'SPIDER_MODULES': ['xdb.spiders']}
Traceback (most recent call last):
  File "c:\python3.7\lib\runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "c:\python3.7\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "C:\Python3.7\Scripts\scrapy.exe\__main__.py", line 9, in <module>
  File "c:\python3.7\lib\site-packages\scrapy\cmdline.py", line 150, in execute
    _run_print_help(parser, _run_command, cmd, args, opts)
  File "c:\python3.7\lib\site-packages\scrapy\cmdline.py", line 90, in _run_print_help
    func(*a, **kw)
  File "c:\python3.7\lib\site-packages\scrapy\cmdline.py", line 157, in _run_command
    cmd.run(args, opts)
  File "c:\python3.7\lib\site-packages\scrapy\commands\crawl.py", line 57, in run
    self.crawler_process.crawl(spname, **opts.spargs)
  File "c:\python3.7\lib\site-packages\scrapy\crawler.py", line 170, in crawl
    crawler = self.create_crawler(crawler_or_spidercls)
  File "c:\python3.7\lib\site-packages\scrapy\crawler.py", line 198, in create_crawler
    return self._create_crawler(crawler_or_spidercls)
  File "c:\python3.7\lib\site-packages\scrapy\crawler.py", line 203, in _create_crawler
    return Crawler(spidercls, self.settings)
  File "c:\python3.7\lib\site-packages\scrapy\crawler.py", line 55, in __init__
    self.extensions = ExtensionManager.from_crawler(self)
  File "c:\python3.7\lib\site-packages\scrapy\middleware.py", line 58, in from_crawler
    return cls.from_settings(crawler.settings, crawler)
  File "c:\python3.7\lib\site-packages\scrapy\middleware.py", line 34, in from_settings
    mwcls = load_object(clspath)
  File "c:\python3.7\lib\site-packages\scrapy\utils\misc.py", line 44, in load_object
    mod = import_module(module)
  File "c:\python3.7\lib\importlib\__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
  File "<frozen importlib._bootstrap>", line 983, in _find_and_load
  File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 677, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 728, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "c:\python3.7\lib\site-packages\scrapy\extensions\telnet.py", line 12, in <module>
    from twisted.conch import manhole, telnet
  File "c:\python3.7\lib\site-packages\twisted\conch\manhole.py", line 154
    def write(self, data, async=False):
                              ^
SyntaxError: invalid syntax

To my surprise, an exception was thrown here: SyntaxError: invalid syntax.

Opening c:\python3.7\lib\site-packages\twisted\conch\manhole.py in PyCharm (Ctrl + left-click on the path in the traceback) shows the offending code.

PyCharm flags line 154 with "formal parameter name expected", and searching the file shows that self.handler.addOutput() uses the same formal parameter name as well.

The root cause is that async became a reserved keyword in Python 3.7, and using a keyword as a parameter name (which is, after all, still a variable) is illegal. Once the cause is clear, the fix is straightforward: rename this parameter everywhere it appears, for example to _async, then save the file.
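The keyword conflict is easy to reproduce in isolation. This small sketch compiles the exact signature from manhole.py line 154, showing that it fails under Python 3.7+ and that the renamed version is accepted:

```python
# `async` is a reserved keyword since Python 3.7, so the original
# signature from twisted/conch/manhole.py no longer even compiles:
try:
    compile("def write(self, data, async=False): pass", "<demo>", "exec")
except SyntaxError:
    print("`async` as a parameter name -> SyntaxError")

# After renaming the parameter (the fix described above),
# the same definition compiles without complaint:
compile("def write(self, data, _async=False): pass", "<demo>", "exec")
print("renamed parameter compiles fine")
```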

5. Verify the fix

Restart the spider, and it now runs normally.
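As an aside, hand-editing files under site-packages works but gets lost on reinstall. Newer Twisted releases (18.9.0 and later, to my understanding) renamed these parameters themselves and support Python 3.7, so upgrading the package is usually the cleaner long-term fix:

```shell
# Upgrade Twisted instead of patching manhole.py by hand;
# 18.9.0+ renamed the `async` parameters for Python 3.7 support.
pip install --upgrade "Twisted>=18.9.0"
```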