Class Exercise (Word Frequency Statistics)
阿新 • Published: 2017-09-25
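What I hope Teacher Zeng will cover
No particular requests. I would like to hear about job prospects in big data and the salaries on offer.
Word frequency statistics for a novel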
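import jieba
import matplotlib.pyplot as plt
from wordcloud import WordCloud

# Raw string so the backslash in the Windows path is not read as an escape.
book = r"F:\最強升級系統.txt"
with open(book, "r", encoding="GBK") as f:
    txt = f.read()

# Words to leave out of the ranking.
ex = {'神仙', '系統', '狂暴', '玩家', '提示', '龍飛'}

# Segment the text and count every word longer than one character.
words = jieba.lcut(txt)
counts = {}
for word in words:
    if len(word) == 1:
        continue
    counts[word] = counts.get(word, 0) + 1

# pop() with a default avoids a KeyError when an excluded word never occurs.
for word in ex:
    counts.pop(word, None)

# Print the ten most frequent words (fewer if the text is short).
items = sorted(counts.items(), key=lambda x: x[1], reverse=True)
for word, count in items[:10]:
    print("{:<10}{:>5}".format(word, count))

# Save the full token list; the explicit encoding avoids relying on the platform default.
with open('lk.txt', 'w', encoding='utf-8') as lk:
    lk.write(str(words))

# Word cloud of the raw text. Chinese glyphs may need WordCloud(font_path=...) set to a CJK font.
wzhz = WordCloud().generate(txt)
plt.imshow(wzhz)
plt.show()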
================ RESTART: C:/Users/Administrator/Desktop/1.py ================
Building prefix dict from the default dictionary ...
Dumping model to file cache C:\Users\ADMINI~1\AppData\Local\Temp\jieba.cache
Loading model cost 0.814 seconds.
Prefix dict has been built succesfully.
沒有           41
喬喬           31
恭喜           26
戰士           23
李三           22
修煉           21
一個           20
廢物           18
蛤蟆功          18
妖獸           17
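The counting and filtering steps in the script can also be written with collections.Counter from the standard library. The following is only a sketch: the helper name top_words and the sample token list are hypothetical, and the list stands in for the output of jieba.lcut(txt) so the snippet runs without the novel file.

```python
from collections import Counter


def top_words(tokens, excluded=(), n=10):
    """Return the n most frequent tokens that are longer than one
    character and are not in the excluded collection."""
    excluded = set(excluded)
    counts = Counter(
        tok for tok in tokens
        if len(tok) > 1 and tok not in excluded
    )
    # most_common() sorts by count, highest first.
    return counts.most_common(n)


# Hypothetical pre-segmented tokens standing in for jieba.lcut(txt).
sample = ["系統", "提示", "戰士", "的", "戰士", "修煉", "系統", "戰士", "修煉"]

for word, count in top_words(sample, excluded={"系統", "提示"}, n=2):
    print("{:<10}{:>5}".format(word, count))
```

Counter.most_common(n) returns at most n pairs, so a short text does not raise an IndexError the way indexing items[i] for ten fixed positions could.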