NLTK raises an error when loading its built-in English stopwords


Loading NLTK's built-in English stopword list raises an error:

>>> stopwords = nltk.corpus.stopwords.words('english')
...

The error:
Traceback (most recent call last):
File "E:\工作\pycharmProject\venv\lib\site-packages\nltk\corpus\util.py", line 80, in __load
try: root = nltk.data.find('{}/{}'.format(self.subdir, zip_name))

File "E:\工作\pycharmProject\venv\lib\site-packages\nltk\data.py", line 675, in find
raise LookupError(resource_not_found)
LookupError:
**********************************************************************
Resource stopwords not found.
Please use the NLTK Downloader to obtain the resource:
import nltk
>>> nltk.download('stopwords')


Searched in:
- 'C:\\Users\\18620/nltk_data'
- 'C:\\nltk_data'
- 'D:\\nltk_data'
- 'E:\\nltk_data'
- 'E:\\工作\\pycharmProject\\venv\\nltk_data'
- 'E:\\工作\\pycharmProject\\venv\\share\\nltk_data'
- 'E:\\工作\\pycharmProject\\venv\\lib\\nltk_data'
- 'C:\\Users\\18620\\AppData\\Roaming\\nltk_data'
**********************************************************************
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<input>", line 1, in <module>
File "E:\工作\pycharmProject\venv\lib\site-packages\nltk\corpus\util.py", line 116, in __getattr__
self.__load()
File "E:\工作\pycharmProject\venv\lib\site-packages\nltk\corpus\util.py", line 81, in __load
except LookupError: raise e
File "E:\工作\pycharmProject\venv\lib\site-packages\nltk\corpus\util.py", line 78, in __load
root = nltk.data.find('{}/{}'.format(self.subdir, self.__name))
File "E:\工作\pycharmProject\venv\lib\site-packages\nltk\data.py", line 675, in find
raise LookupError(resource_not_found)
LookupError:
**********************************************************************
Resource stopwords not found.
Please use the NLTK Downloader to obtain the resource:
import nltk
>>> nltk.download('stopwords')

Searched in:
- 'C:\\Users\\18620/nltk_data'
- 'C:\\nltk_data'
- 'D:\\nltk_data'
- 'E:\\nltk_data'
- 'E:\\工作\\pycharmProject\\venv\\nltk_data'
- 'E:\\工作\\pycharmProject\\venv\\share\\nltk_data'
- 'E:\\工作\\pycharmProject\\venv\\lib\\nltk_data'
- 'C:\\Users\\18620\\AppData\\Roaming\\nltk_data'

**********************************************************************
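
The "Searched in:" list in the traceback shows the directories NLTK looked in for the resource. As a rough way to see which of them exist and already hold the corpus, here is a minimal sketch using only the standard library. It is not NLTK's own lookup code: the helper name `find_resource` is made up for illustration, and it only checks for an unpacked `corpora/stopwords` folder (the `zip_name` in the traceback suggests NLTK also looks for a zipped copy, which this sketch ignores).

```python
from pathlib import Path


def find_resource(search_dirs, resource="corpora/stopwords"):
    """Return the first <dir>/<resource> path that exists, or None."""
    for base in search_dirs:
        candidate = Path(base) / resource
        if candidate.exists():
            return candidate
    return None


# Hypothetical usage: paste in the directories from your own "Searched in:" list.
searched = [
    r"C:\nltk_data",
    r"D:\nltk_data",
]
print(find_resource(searched))
```

If this prints `None`, none of the listed directories contains the unpacked corpus, which is consistent with the LookupError.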

Trying to download it from the console in PyCharm: failed

>>> nltk.download('stopwords')
[nltk_data] Error loading stopwords: <urlopen error [Errno 11004]
[nltk_data]     getaddrinfo failed>
False
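
"getaddrinfo failed" means the host name could not be resolved, so the download never got as far as fetching anything. A quick check of name resolution can be sketched with the standard library. The helper `can_resolve` is a made-up name, and the host below is an assumption: I believe the NLTK downloader's default index is served from raw.githubusercontent.com, but check your own NLTK version.

```python
import socket


def can_resolve(host, port=443):
    """Return True if the host name resolves to at least one address."""
    try:
        return len(socket.getaddrinfo(host, port)) > 0
    except socket.gaierror:
        # Same failure class as the "getaddrinfo failed" message above.
        return False


if __name__ == "__main__":
    # Assumed host of the NLTK data index; adjust if your setup differs.
    print(can_resolve("raw.githubusercontent.com"))
```

If this prints `False`, the problem lies in the network or DNS setup rather than in NLTK, and installing the data by hand is a reasonable workaround.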

Solution:

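Pick any one of the locations NLTK searches (see the "Searched in:" list in the error), for example E:\工作\pycharmProject\venv\Lib, and create the folder nltk_data\corpora\stopwords under it.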

Full path: E:\工作\pycharmProject\venv\Lib\nltk_data\corpora\stopwords. Then create a file named english (no extension) inside stopwords and paste in the following:

i
me
my
myself
we
our
ours
ourselves
you
you're
you've
you'll
you'd
your
yours
yourself
yourselves
he
him
his
himself
she
she's
her
hers
herself
it
it's
its
itself
they
them
their
theirs
themselves
what
which
who
whom
this
that
that'll
these
those
am
is
are
was
were
be
been
being
have
has
had
having
do
does
did
doing
a
an
the
and
but
if
or
because
as
until
while
of
at
by
for
with
about
against
between
into
through
during
before
after
above
below
to
from
up
down
in
out
on
off
over
under
again
further
then
once
here
there
when
where
why
how
all
any
both
each
few
more
most
other
some
such
no
nor
not
only
own
same
so
than
too
very
s
t
can
will
just
don
don't
should
should've
now
d
ll
m
o
re
ve
y
ain
aren
aren't
couldn
couldn't
didn
didn't
doesn
doesn't
hadn
hadn't
hasn
hasn't
haven
haven't
isn
isn't
ma
mightn
mightn't
mustn
mustn't
needn
needn't
shan
shan't
shouldn
shouldn't
wasn
wasn't
weren
weren't
won
won't
wouldn
wouldn't
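
The manual steps (create nltk_data\corpora\stopwords, then a file named english with one word per line) can also be scripted. The following is a minimal sketch using only the standard library; the helper names `write_stopwords` and `read_stopwords` are made up for illustration, and the demo writes to a temporary directory so it cannot overwrite an existing corpus. To use it for real, pass a directory from your own "Searched in:" list and the full word list above.

```python
import tempfile
from pathlib import Path


def write_stopwords(nltk_data_dir, words, language="english"):
    """Write words, one per line, to <nltk_data_dir>/corpora/stopwords/<language>."""
    target_dir = Path(nltk_data_dir) / "corpora" / "stopwords"
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / language  # the file has no extension
    target.write_text("\n".join(words) + "\n", encoding="utf-8")
    return target


def read_stopwords(path):
    """Read the file back as a list of non-empty lines."""
    text = Path(path).read_text(encoding="utf-8")
    return [line for line in text.splitlines() if line]


if __name__ == "__main__":
    # Shortened sample for the demo; use the full list above in practice.
    sample = ["i", "me", "my", "myself", "we"]
    # Temporary directory for the demo only; replace it with a real nltk_data location.
    with tempfile.TemporaryDirectory() as demo_dir:
        path = write_stopwords(demo_dir, sample)
        print(path)
        print(read_stopwords(path))
```

Writing the file as UTF-8 with one word per line mirrors the pasted list; whether NLTK expects anything more of the file is not something this sketch verifies.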

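After that, the call runs successfully. (Alternatively, download the data manually from GitHub. This can take a long time: it took me ages when I did it at school, so I have uploaded a copy here as well.)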