
Problem log: urllib.error.HTTPError: HTTP Error 404: Not Found when testing a mapper written in Python

[email protected]:~/python/pythonfile$ cat keyword.txt
sheep	2
dog,3
firework 3
[email protected]:~/python/pythonfile$ cat keyword.txt | ./mappertest1-1.py
Traceback (most recent call last):
  File "./mappertest1-1.py", line 58, in <module>
    response = urllib.request.urlopen('https://www.bing.com/images/asvnc?q=' + urllib.parse.quote_plus(keyword) + '&async=content&first=' + str(current) + '&adlt=' + adlt)
  File "/usr/lib/python3.4/urllib/request.py", line 161, in urlopen
    return opener.open(url, data, timeout)
  File "/usr/lib/python3.4/urllib/request.py", line 469, in open
    response = meth(req, response)
  File "/usr/lib/python3.4/urllib/request.py", line 579, in http_response
    'http', request, response, code, msg, hdrs)
  File "/usr/lib/python3.4/urllib/request.py", line 501, in error
    result = self._call_chain(*args)
  File "/usr/lib/python3.4/urllib/request.py", line 441, in _call_chain
    result = func(*args)
  File "/usr/lib/python3.4/urllib/request.py", line 684, in http_error_302
    return self.parent.open(new, timeout=req.timeout)
  File "/usr/lib/python3.4/urllib/request.py", line 469, in open
    response = meth(req, response)
  File "/usr/lib/python3.4/urllib/request.py", line 579, in http_response
    'http', request, response, code, msg, hdrs)
  File "/usr/lib/python3.4/urllib/request.py", line 507, in error
    return self._call_chain(*args)
  File "/usr/lib/python3.4/urllib/request.py", line 441, in _call_chain
    result = func(*args)
  File "/usr/lib/python3.4/urllib/request.py", line 587, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 404: Not Found

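The script reads keywords from text, then searches for and downloads matching images. When the keyword is assigned directly as a string in the code, the search and download work.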

A mapper, however, should read its input line by line from sys.stdin, like this:

for line in sys.stdin:

*************** (implementation details)
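
The lines in keyword.txt use three different separators (a tab, a comma, and a space), so the loop above needs to split them consistently before a keyword reaches the search URL. The following is a minimal sketch, not the original mapper: the names `parse_line` and `parse_lines` are hypothetical, and it assumes each line is a single-word keyword followed by a number, which is treated here as a count.

```python
import re

# Assumption: keyword and number are separated by a tab, a comma, or spaces,
# as in the three sample lines of keyword.txt.
_SEPARATORS = re.compile(r"[\t, ]+")


def parse_line(line):
    """Return (keyword, count) for one line, or None if it is blank or malformed."""
    fields = _SEPARATORS.split(line.strip())
    if len(fields) != 2 or not fields[1].isdigit():
        return None
    return fields[0], int(fields[1])


def parse_lines(stream):
    """Yield (keyword, count) pairs from any iterable of lines, such as sys.stdin."""
    for line in stream:
        parsed = parse_line(line)
        if parsed is not None:
            yield parsed
```

In a mapper this would be driven with `for keyword, count in parse_lines(sys.stdin):`. Stripping the line first matters because each line read from sys.stdin still carries its trailing newline, which would otherwise end up inside the URL.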

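Written this way, the error above appeared instead: opening the URL failed. The image download uses multiple threads, and at first I could not tell where the problem was.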

It turned out to be the URL!!!

The original URL was set as:

response = urllib.request.urlopen('https://www.bing.com/images/asvnc?q=' + urllib.parse.quote_plus(keyword) + '&async=content&first=' + str(current) + '&adlt=' + adlt)

After the change:

response = urllib.request.urlopen('https://cn.bing.com/images/async?q=' + urllib.parse.quote_plus(keyword) + '&async=content&first=' + str(current) + '&adlt=' + adlt)
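
Building the query string by hand makes small differences between two URLs easy to miss. As a sketch under assumptions, the same URL could be assembled with `urllib.parse.urlencode`, which applies `quote_plus` to each value by default. The function name `build_search_url` and the `host` parameter are hypothetical, and the `adlt` value `"off"` in the usage line is only a placeholder, since the original value is not shown.

```python
import urllib.parse


def build_search_url(keyword, current, adlt, host="cn.bing.com"):
    """Assemble the image search URL for one keyword and result offset."""
    query = urllib.parse.urlencode({
        "q": keyword.strip(),   # drop the newline left over from sys.stdin
        "async": "content",
        "first": current,       # non-string values are converted with str()
        "adlt": adlt,
    })
    return "https://" + host + "/images/async?" + query
```

For example, `build_search_url("sheep", 0, "off")` gives `https://cn.bing.com/images/async?q=sheep&async=content&first=0&adlt=off`, which can then be passed to `urllib.request.urlopen`.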

Preliminary explanation:

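When accessed directly, www.bing.com reports a connection error and then forcibly redirects to cn.bing.com; I don't know why. This matches the traceback, where a 302 redirect is followed before the 404 is raised. Note that the two URLs as recorded here also differ in the path (`asvnc` versus `async`), so the host may not be the only factor.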

Visiting Bing's home page in a browser also lands directly on cn.bing.com. Clicking "Switch to English" leads to http://global.bing.com/?FORM=HPCNEN&setmkt=en-us&setlang=en-us rather than www.bing.com.
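
To see where a request actually ends up after a redirect like this, the error can be caught and inspected instead of letting the traceback end the mapper. This is a minimal sketch, not part of the original script; `describe_http_error` and `fetch` are hypothetical helper names. It relies on the `code`, `reason`, and `url` attributes of `urllib.error.HTTPError`.

```python
import sys
import urllib.error
import urllib.request


def describe_http_error(err):
    """Summarise an HTTPError: status code, reason, and the URL that produced it."""
    return "HTTP {} {} for {}".format(err.code, err.reason, err.url)


def fetch(url, timeout=10):
    """Return the response body, or None after reporting an HTTP error."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.read()
    except urllib.error.HTTPError as err:
        # After a followed redirect, err.url is the redirected URL,
        # not necessarily the one passed in.
        # Report on stderr so the mapper's stdout stays clean.
        print(describe_http_error(err), file=sys.stderr)
        return None
```

Comparing the URL in the message with the one that was requested shows whether the 404 came from the original address or from the redirect target.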