Adding a Stop Word List to Ansj
阿新 • Published: 2019-02-05
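StopWordTable.txt is a general-purpose Chinese stop word list; copies are easy to find online. strHashMap is the stop word dictionary built from that file. Once the dictionary has been registered with FilterModifWord.setUpdateDic(), calling FilterModifWord.modifResult() removes the stop words.

    import java.io.BufferedReader;
    import java.io.File;
    import java.io.FileInputStream;
    import java.io.InputStreamReader;
    import java.util.HashMap;
    // Package name as in older Ansj releases; adjust it to your version
    import org.ansj.util.FilterModifWord;

    // Maps every stop word to the marker "_stop"
    HashMap<String, String> strHashMap = new HashMap<String, String>();
    String stopWordTable = "StopWordTable.txt";
    File f = new File(stopWordTable);
    FileInputStream fileInputStream = new FileInputStream(f);
    // Read the stop word file, one word per line
    BufferedReader StopWordFileBr = new BufferedReader(new InputStreamReader(fileInputStream));
    String stopWord = null;
    while ((stopWord = StopWordFileBr.readLine()) != null) {
        strHashMap.put(stopWord, "_stop");
    }
    StopWordFileBr.close();
    // Register the stop word dictionary with Ansj
    FilterModifWord.setUpdateDic(strHashMap);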
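To make the filtering idea concrete without depending on Ansj, here is a minimal plain-Java sketch. The class StopWordFilterSketch and its methods loadStopWords and filter are hypothetical names made up for illustration. The sketch only imitates the pattern described above (mark each stop word with "_stop", then drop marked tokens) and is not how FilterModifWord is implemented. It works on plain strings rather than Ansj terms, uses English tokens so the example stays encoding-neutral, and trims lines and skips blank ones, which the snippet above does not do.

```java
import java.io.BufferedReader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class StopWordFilterSketch {

    // Same marker value the article stores for every stop word.
    static final String STOP_MARKER = "_stop";

    // Builds the word -> "_stop" map from a reader holding one word per line.
    // Assumption: trimming and skipping blank lines is desirable here.
    static Map<String, String> loadStopWords(BufferedReader reader) {
        Map<String, String> stopWords = new HashMap<>();
        reader.lines()
              .map(String::trim)
              .filter(line -> !line.isEmpty())
              .forEach(word -> stopWords.put(word, STOP_MARKER));
        return stopWords;
    }

    // Hypothetical stand-in for the filtering step: keeps only the tokens
    // that are not marked as stop words in the map.
    static List<String> filter(List<String> tokens, Map<String, String> stopWords) {
        List<String> kept = new ArrayList<>();
        for (String token : tokens) {
            if (!STOP_MARKER.equals(stopWords.get(token))) {
                kept.add(token);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // An in-memory "file" replaces StopWordTable.txt so the sketch is self-contained.
        BufferedReader reader = new BufferedReader(new StringReader("the\nof\n\n is \n"));
        Map<String, String> stopWords = loadStopWords(reader);

        List<String> tokens = Arrays.asList("the", "cover", "of", "book", "is", "red");
        System.out.println(filter(tokens, stopWords)); // prints [cover, book, red]
    }
}
```

In a real project the reader would wrap the stop word file instead of a StringReader, and the token list would come from the segmenter's output.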