AttributeError: 'file' object has no attribute 'lower' when calling vectorizer.fit_transform

By 阿新 • Published: 2019-02-20

Problem

While reading the first edition of Building Machine Learning Systems with Python [1], I ran into an error in one of its code samples:

AttributeError: 'file' object has no attribute 'lower'

The code that produces the error is:

import os

from sklearn.feature_extraction.text import CountVectorizer

TOY_DIR = './data/toy/'

# posts is a list of open file objects, not a list of strings
posts = [open(os.path.join(TOY_DIR, f)) for f in os.listdir(TOY_DIR)]

# with the default input="content", the vectorizer expects strings
vectorizer = CountVectorizer(min_df=1)

X_train = vectorizer.fit_transform(posts)  # raises the AttributeError above
num_samples, num_features = X_train.shape
print("#samples, %d, #features, %d" % (num_samples, num_features))

Solution

Thanks to a fix I found online, the answer is to change vectorizer = CountVectorizer(min_df=1) to

vectorizer = CountVectorizer(min_df=1, input="file")

which resolves the error above.
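The reason this works: with the default input="content", CountVectorizer treats each item as a string and lowercases it, which fails on a file object. With input="file", it calls read() on each item first. The sketch below illustrates both ways of feeding the same documents. The file names and their contents are hypothetical stand-ins for the book's ./data/toy/ directory, written to a temporary directory so the example is self-contained.

```python
import os
import tempfile

from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical toy corpus standing in for the book's ./data/toy/ directory.
toy_texts = {
    "01.txt": "This is a toy post about machine learning.",
    "02.txt": "Imaging databases can get huge.",
    "03.txt": "Most imaging databases save images permanently.",
}

with tempfile.TemporaryDirectory() as toy_dir:
    for name, text in toy_texts.items():
        with open(os.path.join(toy_dir, name), "w", encoding="utf-8") as f:
            f.write(text)

    paths = [os.path.join(toy_dir, name) for name in sorted(os.listdir(toy_dir))]

    # Option A: pass open file objects and tell the vectorizer to read() them.
    handles = [open(p, encoding="utf-8") for p in paths]
    try:
        X_from_files = CountVectorizer(min_df=1, input="file").fit_transform(handles)
    finally:
        for h in handles:
            h.close()

    # Option B: read the text yourself and keep the default input="content".
    texts = []
    for p in paths:
        with open(p, encoding="utf-8") as f:
            texts.append(f.read())
    X_from_text = CountVectorizer(min_df=1).fit_transform(texts)

num_samples, num_features = X_from_files.shape
print("#samples, %d, #features, %d" % (num_samples, num_features))
```

Both options should produce the same document-term matrix. CountVectorizer also accepts input="filename", in which case it opens each path itself.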

I was working in an IPython notebook. After editing the code in the same cell and running it again, a new error appeared:

ValueError: empty vocabulary; perhaps the documents only contain stop words

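I spent a long time trying things without finding a suitable fix. What finally worked was deleting every cell that contained the earlier code, creating a new cell, and putting the updated code in it. That made the "empty vocabulary" error go away. In my case the error had nothing to do with the code itself and came from the IPython notebook environment, so keep this kind of problem in mind when working in IPython notebook.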
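One plausible mechanism (an assumption; nothing above confirms it) is stale notebook state. If the list of open file objects is created once and fit_transform is then run on it more than once, the first run reads every file to the end, and any later run gets empty strings from read(). The sketch below reproduces that situation with in-memory io.StringIO streams standing in for the opened files. Their contents are made up for illustration.

```python
import io

from sklearn.feature_extraction.text import CountVectorizer

# In-memory stand-ins for the opened toy files (hypothetical contents).
posts = [io.StringIO("This is a toy post."), io.StringIO("Another toy post.")]

# First run: each stream is read to the end.
X_train = CountVectorizer(min_df=1, input="file").fit_transform(posts)

# Second run on the same objects: read() now returns "" for every document,
# so no tokens are found and the vectorizer raises ValueError.
try:
    CountVectorizer(min_df=1, input="file").fit_transform(posts)
    second_run_error = None
except ValueError as exc:
    second_run_error = str(exc)
print(second_run_error)

# Rewinding the streams (or reopening the files) makes them readable again.
for post in posts:
    post.seek(0)
X_again = CountVectorizer(min_df=1, input="file").fit_transform(posts)
print(X_again.shape == X_train.shape)
```

If this is what happened, re-running the cell that opens the files before calling fit_transform again would have the same effect as rebuilding the notebook cells.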

Output of a successful run: [screenshot not preserved]

[1] Willi Richert, Luis Pedro Coelho. Building Machine Learning Systems with Python. Packt Publishing, 2013.
