[NLP] Implementing a Keyword Extraction Algorithm
阿新 • Published: 2018-12-27
Implementation code (nodenlp.js):
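// Extract the top keywords from t.txt with nodejieba
const nodejieba = require('nodejieba');
const fs = require('fs');

const topN = 100; // maximum number of keywords to return
const data = fs.readFileSync('t.txt', 'utf8'); // input text, read as UTF-8
console.log(data);

// extract() returns an array of { word, weight } objects
const result = nodejieba.extract(data, topN);
console.log('11==>', result);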
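The raw list can contain entries that are not useful as keywords, such as the bare number '11' or very common words with low weights (both appear in the run shown later in this post). Below is a minimal sketch of a post-processing step, assuming the `{ word, weight }` shape that `extract` prints. `filterKeywords` and its options are hypothetical names for illustration, not part of nodejieba.

```javascript
// Hypothetical helper (not a nodejieba API): filters an extract()-style result.
function filterKeywords(keywords, options = {}) {
  const { minWeight = 0, minLength = 2, dropNumeric = true } = options;
  return keywords.filter(({ word, weight }) => {
    if (weight < minWeight) return false;               // drop low-scoring words
    if ([...word].length < minLength) return false;     // drop very short tokens
    if (dropNumeric && /^\d+$/.test(word)) return false; // drop pure numbers such as '11'
    return true;
  });
}

// Sample entries copied from the output shown in this post.
const sample = [
  { word: '隕石', weight: 45.6077707943 },
  { word: '火流星', weight: 24.582479479 },
  { word: '11', weight: 11.739204307083542 },
  { word: '一個', weight: 2.81755097213 },
];

const kept = filterKeywords(sample, { minWeight: 5 });
console.log(kept.map((k) => k.word)); // [ '隕石', '火流星' ]
```

The threshold of 5 is arbitrary; a sensible cut-off depends on the text length and on how many keywords are wanted.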
Input file t.txt:
據中國之聲《新聞縱橫》報道,在剛剛過去的中秋之夜,一顆“火流星”滑亮了雲南省迪慶州的夜空。根據相關天文機構公佈的資訊,隕石墜落的地點,可能位於香格里拉市的巴拉格宗景區範圍內。 事發一週之後,昨天(11日)下午,記者專訪了巴拉格宗景區相關人員。對方稱,目前還是沒有確定隕石墜落的具體位置。最近,有很多人員都在當地尋找隕石,但至今沒有任何訊息。雖然隕石還沒有找到,但在網上有關隕石歸屬的問題已經引發了討論。 巴拉格宗景區的工作人員洛桑培楚說,事發當時,景區的多位工作人員都目睹了那顆“火流星”,“因為我們酒店的位置,剛好是在一個U字型的峽谷裡,感覺突然間天空特別亮,有個東西就飛過來了,打在對面的崖壁上,過了幾分鐘之後,就聽見咚的一聲,附近村民有明顯的震感。”
Result:
liuyugang:NodeJieBa apple$ node nodenlp.js
....
11==> [ { word: '隕石', weight: 45.6077707943 },
{ word: '格宗', weight: 35.21761292125063 },
{ word: '景區', weight: 32.27518069876 },
{ word: '巴拉', weight: 29.735080816230003 },
{ word: '火流星', weight: 24.582479479 },
{ word: '墜落', weight: 18.22637181838 },
{ word: '事發', weight: 16.80701885336 },
{ word: '工作人員', weight: 13.28734988976 },
{ word: '震感', weight: 12.5143832909 },
{ word: '迪慶', weight: 11.9547675029 },
{ word: '11', weight: 11.739204307083542 },
{ word: '培楚', weight: 11.739204307083542 },
{ word: '有個', weight: 11.739204307083542 },
{ word: '人員', weight: 11.18200151198 },
{ word: '新聞縱橫', weight: 11.0103058941 },
{ word: '具體位置', weight: 10.8096351986 },
{ word: '飛過來', weight: 10.765183436 },
{ word: '香格里拉', weight: 10.642581114 },
{ word: '洛桑', weight: 10.2630914922 },
{ word: '字型', weight: 10.0088573539 },
{ word: '相關', weight: 9.67141986604 },
{ word: '崖壁', weight: 9.65218240993 },
{ word: '沒有', weight: 9.338470695449999 },
{ word: '目睹', weight: 8.79473217808 },
{ word: '之後', weight: 8.7536825453 },
{ word: '夜空', weight: 8.75318317516 },
{ word: '之夜', weight: 8.65893063692 },
{ word: '中秋', weight: 8.55357012126 },
{ word: '那顆', weight: 8.5488195185 },
{ word: '幾分鐘', weight: 8.4980002701 },
{ word: '專訪', weight: 8.35941410682 },
{ word: '多位', weight: 8.01735526349 },
{ word: '雲南省', weight: 8.00903344015 },
{ word: '歸屬', weight: 8.00078029839 },
{ word: '剛好', weight: 7.90174109003 },
{ word: '之聲', weight: 7.58531965045 },
{ word: '天文', weight: 7.45973111134 },
{ word: '峽谷', weight: 7.41757030052 },
{ word: '村民', weight: 7.28595205177 },
{ word: '酒店', weight: 7.19748953873 },
{ word: '對面', weight: 7.13679274341 },
{ word: '天空', weight: 6.90491149567 },
{ word: '一顆', weight: 6.84364067028 },
{ word: '地點', weight: 6.68250081357 },
{ word: '一週', weight: 6.6090214428 },
{ word: '討論', weight: 6.28144423575 },
{ word: '引發', weight: 6.18600017817 },
{ word: '網上', weight: 6.15610784262 },
{ word: '尋找', weight: 6.04010686644 },
{ word: '下午', weight: 5.96939289045 },
{ word: '昨天', weight: 5.92683327603 },
{ word: '聽見', weight: 5.92339566522 },
{ word: '報道', weight: 5.88040717916 },
{ word: '剛剛', weight: 5.78366356424 },
{ word: '最近', weight: 5.76738379075 },
{ word: '位置', weight: 5.67463922249 },
{ word: '找到', weight: 5.66161232021 },
{ word: '感覺', weight: 5.64147828931 },
{ word: '確定', weight: 5.35063012369 },
{ word: '資訊', weight: 5.25386069277 },
{ word: '範圍', weight: 5.19468393767 },
{ word: '附近', weight: 5.16934129144 },
{ word: '一聲', weight: 5.15269025031 },
{ word: '公佈', weight: 5.06198083963 },
{ word: '訊息', weight: 5.03989475617 },
{ word: '突然', weight: 4.99713421631 },
{ word: '位於', weight: 4.96609078159 },
{ word: '很多', weight: 4.85828267085 },
{ word: '東西', weight: 4.77328420082 },
{ word: '過去', weight: 4.75519585235 },
{ word: '特別', weight: 4.74775455087 },
{ word: '當時', weight: 4.67584283385 },
{ word: '機構', weight: 4.65227107919 },
{ word: '明顯', weight: 4.63964416568 },
{ word: '記者', weight: 4.29694475313 },
{ word: '問題', weight: 3.96351357308 },
{ word: '目前', weight: 3.91528758382 },
{ word: '可能', weight: 3.74802798573 },
{ word: '已經', weight: 3.42054864564 },
{ word: '中國', weight: 3.02732068666 },
{ word: '一個', weight: 2.81755097213 } ]
liuyugang:NodeJieBa apple$
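The weights in this run look consistent with a TF-IDF-style score, that is, the number of occurrences multiplied by a per-word IDF value. The words '11', '培楚' and '有個' each occur once and share the weight 11.739204307083542, which suggests a common default IDF for words missing from the IDF dictionary. '格宗' occurs three times and scores three times that value. The sketch below illustrates that reading. It is an assumption inferred from the numbers, not nodejieba's actual code: `tfidfKeywords` is a hypothetical name, the input is assumed to be already segmented, and the IDF for '隕石' is back-computed from its reported weight and its 5 occurrences in t.txt.

```javascript
// Illustrative TF-IDF scorer; NOT nodejieba's implementation.
// tokens: already-segmented words; idf: Map of word -> IDF value;
// fallbackIdf: IDF used for words missing from the map.
function tfidfKeywords(tokens, idf, fallbackIdf, topN) {
  // Count how often each word occurs (term frequency as a raw count).
  const counts = new Map();
  for (const token of tokens) {
    counts.set(token, (counts.get(token) || 0) + 1);
  }
  // Score each distinct word as count * IDF.
  const scored = [];
  for (const [word, count] of counts) {
    const wordIdf = idf.has(word) ? idf.get(word) : fallbackIdf;
    scored.push({ word, weight: count * wordIdf });
  }
  // Highest weight first, then keep the top N.
  scored.sort((a, b) => b.weight - a.weight);
  return scored.slice(0, topN);
}

// Assumed values, derived from the output above rather than read from a dictionary.
const idf = new Map([['隕石', 45.6077707943 / 5]]);
const fallbackIdf = 11.739204307083542;
const tokens = ['隕石', '隕石', '隕石', '隕石', '隕石', '格宗', '格宗', '格宗', '培楚'];

console.log(tfidfKeywords(tokens, idf, fallbackIdf, 3));
```

With these inputs the ranking comes out as '隕石', '格宗', '培楚', matching their relative order in the real output.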