Problem log: urllib.error.HTTPError: HTTP Error 404: Not Found when testing a mapper written in Python
阿新 • Published: 2018-12-31
[email protected]:~/python/pythonfile$ cat keyword.txt
sheep 2
dog,3
firework 3
[email protected]:~/python/pythonfile$ cat keyword.txt | ./mappertest1-1.py
Traceback (most recent call last):
  File "./mappertest1-1.py", line 58, in <module>
    response = urllib.request.urlopen('https://www.bing.com/images/asvnc?q=' + urllib.parse.quote_plus(keyword) + '&async=content&first=' + str(current) + '&adlt=' + adlt)
  File "/usr/lib/python3.4/urllib/request.py", line 161, in urlopen
    return opener.open(url, data, timeout)
  File "/usr/lib/python3.4/urllib/request.py", line 469, in open
    response = meth(req, response)
  File "/usr/lib/python3.4/urllib/request.py", line 579, in http_response
    'http', request, response, code, msg, hdrs)
  File "/usr/lib/python3.4/urllib/request.py", line 501, in error
    result = self._call_chain(*args)
  File "/usr/lib/python3.4/urllib/request.py", line 441, in _call_chain
    result = func(*args)
  File "/usr/lib/python3.4/urllib/request.py", line 684, in http_error_302
    return self.parent.open(new, timeout=req.timeout)
  File "/usr/lib/python3.4/urllib/request.py", line 469, in open
    response = meth(req, response)
  File "/usr/lib/python3.4/urllib/request.py", line 579, in http_response
    'http', request, response, code, msg, hdrs)
  File "/usr/lib/python3.4/urllib/request.py", line 507, in error
    return self._call_chain(*args)
  File "/usr/lib/python3.4/urllib/request.py", line 441, in _call_chain
    result = func(*args)
  File "/usr/lib/python3.4/urllib/request.py", line 587, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 404: Not Found
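The script reads keywords from a text file, then searches for and downloads matching images. When the keyword is assigned directly as a string in the code, searching and downloading both work.
A mapper, however, should read its input line by line from sys.stdin, like this: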
for line in sys.stdin:
    # *************** implementation details
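Written this way, however, the error above appeared: opening the URL fails. The image download uses multiple threads, and at this point I did not know where the problem was.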
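For reference, the stdin-driven loop described above could be sketched roughly as follows. This is only an illustration, not the original script: `parse_line` and `read_keywords` are hypothetical names, and it assumes each line holds a single-word keyword followed by a count, separated by a comma or whitespace as in keyword.txt. One detail worth keeping in mind is that lines read from sys.stdin still carry their trailing newline, so they are stripped before use.

```python
import re


def parse_line(line):
    """Split one input line into (keyword, count); return None for blank lines."""
    line = line.strip()  # lines from sys.stdin keep their trailing newline
    if not line:
        return None
    # Assumption: keyword and count are separated by a comma and/or whitespace.
    parts = re.split(r"[,\s]+", line)
    keyword = parts[0]
    # Assumption: a missing count defaults to 1.
    count = int(parts[1]) if len(parts) > 1 else 1
    return keyword, count


def read_keywords(stream):
    """Yield (keyword, count) pairs from any iterable of lines, e.g. sys.stdin."""
    for line in stream:
        parsed = parse_line(line)
        if parsed is not None:
            yield parsed
```

In the mapper this would be driven with `for keyword, count in read_keywords(sys.stdin):`, which keeps the parsing testable without a pipe.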
It turned out to be the URL!!!
The original URL was:
response = urllib.request.urlopen('https://www.bing.com/images/asvnc?q=' + urllib.parse.quote_plus(keyword) + '&async=content&first=' + str(current) + '&adlt=' + adlt)
After the change:
response = urllib.request.urlopen('https://cn.bing.com/images/async?q=' + urllib.parse.quote_plus(keyword) + '&async=content&first=' + str(current) + '&adlt=' + adlt)
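Because the URL is assembled by string concatenation, a small mistake in the host or path is easy to miss. A minimal sketch of building the same URL in one place with urllib.parse.urlencode is shown below. `BASE_URL` and `build_search_url` are hypothetical names; the endpoint and the parameter names are copied from the corrected line above and are not a documented API.

```python
import urllib.parse

# Endpoint taken from the corrected line above (an assumption that it stays valid).
BASE_URL = "https://cn.bing.com/images/async"


def build_search_url(keyword, current, adlt, base_url=BASE_URL):
    """Return the search URL for a keyword, result offset, and adult-filter setting."""
    params = [
        ("q", keyword),
        ("async", "content"),
        ("first", str(current)),
        ("adlt", adlt),
    ]
    # urlencode applies quote_plus to each value by default,
    # matching the quote_plus call in the original line.
    return base_url + "?" + urllib.parse.urlencode(params)
```

The result can then be passed to urllib.request.urlopen as before, for example `urllib.request.urlopen(build_search_url(keyword, current, adlt))`.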
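Preliminary explanation:
When accessed directly, www.bing.com reports a connection error and then forcibly redirects to cn.bing.com; I do not know why. Note that the two lines above also differ in the path: the original has asvnc where the corrected one has async, so the misspelled path may be what actually produced the 404.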
Visiting the Bing home page in a browser also lands directly on cn.bing.com. Clicking "switch to english" leads to http://global.bing.com/?FORM=HPCNEN&setmkt=en-us&setlang=en-us rather than www.bing.com.
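The redirect behaviour described above can be checked from Python by comparing the requested URL with the URL the response finally came from. The sketch below is an assumption-laden helper, not part of the original script: `describe_fetch` is a hypothetical name, the `opener` parameter exists only so the function can be exercised without network access, and what it reports for www.bing.com will depend on the region the request is made from.

```python
import urllib.error
import urllib.request


def describe_fetch(url, opener=urllib.request.urlopen):
    """Return (status_code, final_url) for url instead of raising on HTTP errors."""
    try:
        with opener(url) as response:
            # geturl() gives the URL after any redirects were followed.
            return response.getcode(), response.geturl()
    except urllib.error.HTTPError as err:
        # HTTPError carries the status code and the URL that finally failed.
        return err.code, err.geturl()
```

Calling `describe_fetch('https://www.bing.com/')` and printing the result shows whether the request ended up on a different host, and for a failing request it shows which URL returned the error.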