Deploying pyspider and the problems I ran into (on CentOS 7 with Python 3.5)

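I deployed pyspider on my own VPS (CentOS 7), inside a virtualenv, using Python 3.5.2.
Make sure the build environment (compiler and development headers) is fully installed first.
For installing Python 3.5 on CentOS 7, enabling virtualenv, and the required build environment, see here.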

Deployment

# Create a virtual environment and activate it
virtualenv -p /usr/bin/python3 ~/envs/testenv
source ~/envs/testenv/bin/activate

# Install pycurl first (installing pyspider pulls it in automatically, but the automatic install failed for me)
export PYCURL_SSL_LIBRARY=nss
pip install pycurl --no-cache-dir

# Install pyspider
pip install pyspider

# Run it
pyspider
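
Before running the pip commands it can help to confirm that the virtualenv is really active, since otherwise the packages end up in the system Python. A minimal sketch, assuming the activate script has set VIRTUAL_ENV as virtualenv normally does; in_virtualenv is a hypothetical helper name:

```shell
# Succeeds when a virtualenv has been activated in this shell.
in_virtualenv() {
    [ -n "${VIRTUAL_ENV:-}" ]
}

if in_virtualenv; then
    echo "virtualenv active: $VIRTUAL_ENV"
else
    echo "no virtualenv active; run: source ~/envs/testenv/bin/activate"
fi
```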

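Problems encountered

After running pip install pyspider, pip reports that pycurl cannot be installed

This happens because the build environment is incomplete. Installing libcurl-devel with yum (yum install libcurl-devel) fixes it.

pyspider installs successfully, but running the pyspider command raises an error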

RuntimeError: Click will abort further execution because Python 3 was configured to use ASCII as encoding for the environment.  Either run this under Python 2
or consult http://click.pocoo.org/python3/ for mitigation steps.

You are dealing with an environment where Python 3 thinks you are restricted to ASCII data. The solution to these problems is different depending on which locale your computer is running in.
If you are on a US machine, en_US.utf-8 is the encoding of choice. On some newer Linux systems, you could also try C.UTF-8 as the locale:
export LC_ALL=C.UTF-8
export LANG=C.UTF-8

Run:

export LC_ALL=en_US.utf-8
export LANG=en_US.utf-8
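
To see in advance whether Click is likely to complain, one can check which locale setting actually takes effect. The sketch below assumes the usual POSIX precedence (LC_ALL, then LC_CTYPE, then LANG) and only looks at the locale name, so it does not prove the locale is installed. effective_ctype and is_utf8_locale are hypothetical helper names, not part of Click or pyspider:

```shell
# First non-empty value among LC_ALL, LC_CTYPE, LANG (POSIX precedence).
effective_ctype() {
    printf '%s\n' "${LC_ALL:-${LC_CTYPE:-${LANG:-}}}"
}

# Succeeds when the locale name ends in a UTF-8 codeset suffix.
is_utf8_locale() {
    case "$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')" in
        *.utf-8|*.utf8) return 0 ;;
        *)              return 1 ;;
    esac
}

if is_utf8_locale "$(effective_ctype)"; then
    echo "locale looks UTF-8: $(effective_ctype)"
else
    echo "locale is not UTF-8; export LC_ALL and LANG before running pyspider"
fi
```

These exports only last for the current shell session, so they need to be repeated (or placed in a shell startup file) for later sessions.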

Running the pyspider command raises an error

libcurl link-time ssl backend (nss) is different from compile-time ssl backend (none/other)

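Uninstall pycurl and reinstall it as described earlier, with PYCURL_SSL_LIBRARY set to the link-time backend named in the error.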

pip uninstall pycurl
export PYCURL_SSL_LIBRARY=nss    # must match the link-time backend in the error (nss here; openssl or gnutls on other systems)
pip install pycurl --no-cache-dir
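
The right value can usually be read from the first line of curl --version, which lists the SSL library libcurl was built with. A minimal sketch, assuming the curl command-line tool uses the same libcurl that pycurl will link against; backend_from_version is a hypothetical helper, not part of pycurl:

```shell
# Map the first line of `curl --version` to a PYCURL_SSL_LIBRARY value.
backend_from_version() {
    case "$1" in
        *NSS/*)     echo nss ;;
        *OpenSSL/*) echo openssl ;;
        *GnuTLS/*)  echo gnutls ;;
        *)          echo unknown ;;
    esac
}

# Only act when curl is actually on PATH.
if command -v curl >/dev/null 2>&1; then
    backend=$(backend_from_version "$(curl --version | head -n 1)")
    echo "libcurl SSL backend: $backend"
    # Export only a recognised value; otherwise leave the variable alone.
    if [ "$backend" != unknown ]; then
        export PYCURL_SSL_LIBRARY="$backend"
    fi
fi
```

After this, pip install pycurl --no-cache-dir in the same shell picks up the exported value.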

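Running the pyspider command reports ImportError: No module named '_sqlite3'

This means Python 3 itself was not built correctly: the _sqlite3 extension module was not compiled. See here.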
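
Whether a given interpreter has the module can be checked before starting pyspider. A minimal sketch; has_sqlite3 is a hypothetical helper name, and the hint about sqlite-devel assumes the usual cause, namely that the SQLite development headers were absent when Python was compiled:

```shell
# Succeeds when the given interpreter (default: python3) can import sqlite3.
has_sqlite3() {
    "${1:-python3}" -c 'import sqlite3' >/dev/null 2>&1
}

if has_sqlite3; then
    echo "sqlite3: OK"
else
    echo "sqlite3: missing (install sqlite-devel, then rebuild Python)"
fi
```

Inside the virtualenv, has_sqlite3 python tests the interpreter that pyspider will actually use.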

Output after a successful start

[W 160907 13:37:27 run:403] phantomjs not found, continue running without it.
[I 160907 13:37:29 result_worker:49] result_worker starting...
[I 160907 13:37:30 processor:208] processor starting...
[I 160907 13:37:30 scheduler:569] scheduler starting...
[I 160907 13:37:30 scheduler:508] in 5m: new:0,success:0,retry:0,failed:0
[I 160907 13:37:30 tornado_fetcher:508] fetcher starting...
/root/envs/hcomic/lib/python3.5/site-packages/flask/exthook.py:71: ExtDeprecationWarning: Importing flask.ext.login is deprecated, use flask_login instead.
  .format(x=modname), ExtDeprecationWarning
[I 160907 13:37:30 scheduler:683] scheduler.xmlrpc listening on 127.0.0.1:23333
[W 160907 13:37:31 app:61] WebDav interface not enabled: ImportError("No module named 'wsgidav'",)
[I 160907 13:37:31 app:76] webui running on 0.0.0.0:5000

At this point, open port 5000 on the VPS and the pyspider webui can be reached from a browser.
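
How the port is opened depends on the firewall in use. A minimal sketch, assuming the VPS runs firewalld (the CentOS 7 default) and the commands are run as root; a VPS that uses iptables directly or a provider-side firewall needs a different step:

```shell
PORT=5000   # webui port, taken from the log line "webui running on 0.0.0.0:5000"

if command -v firewall-cmd >/dev/null 2>&1; then
    firewall-cmd --permanent --add-port="${PORT}/tcp"   # persist the rule
    firewall-cmd --reload                               # apply it to the running firewall
else
    echo "firewall-cmd not found; open TCP port ${PORT} with your own firewall tool"
fi
```

Since the webui listens on 0.0.0.0, opening the port exposes it to anyone who can reach the VPS.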

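pyspider does not come with phantomjs or wsgidav by default.
You have to set them up yourself to enable them.
By default it uses an sqlite database and creates a data directory in the directory where the pyspider command is run.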
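
Whether the two optional components are present can be checked before starting pyspider. A minimal sketch, assuming pyspider looks for phantomjs on PATH and needs the wsgidav package to be importable, which is what the two warnings in the startup log suggest; check_optional is a hypothetical helper name:

```shell
# Print the status of the two optional components named in the startup log.
# $1: Python interpreter to test with (defaults to "python" on PATH).
check_optional() {
    py="${1:-python}"
    if command -v phantomjs >/dev/null 2>&1; then
        echo "phantomjs: found"
    else
        echo "phantomjs: not on PATH"
    fi
    if "$py" -c 'import wsgidav' >/dev/null 2>&1; then
        echo "wsgidav: importable"
    else
        echo "wsgidav: not importable"
    fi
}

check_optional
```

Run it inside the activated virtualenv so that python refers to the interpreter pyspider uses.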