Solution to "AGC048B Bracket Score"

import jieba
excludes = {"什麼","一個","我們","那裡","你們","如今","說道","知道","起來","姑娘","這裡","出來","他們","眾人","自己",
            "一面","只見","怎麼","兩個","沒有","不是","不知","這個","聽見","這樣","進來","咱們","告訴","就是",
            "東西","襲人","回來","只是","大家","只得","老爺","丫頭","這些","不敢","出去","所以","不過","的話","不好",
            "姐姐","探春","鴛鴦","一時","不能","過來","心裡","如此","今日","銀子","幾個","答應","二人","還有","只管",
            "這麼","說話","一回","那邊","這話","外頭","打發","自然","今兒","罷了","屋裡","那些","聽說","小丫頭","不用","如何"}

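# With no directory given, the path is resolved relative to the current working
# directory (the script's own folder when you run it from there).
# The file must be saved as UTF-8: open the txt file in an editor, choose "Save As",
# and check the encoding option at the lower right of the dialog.
with open("紅樓夢.txt", "r", encoding="utf-8") as f:
    txt = f.read()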
# Use jieba to segment the full text of 紅樓夢 into a list of words
words = jieba.lcut(txt)
# Empty dictionary that will map each word to its count
counts = {}
for word in words:
    if len(word) == 1:      # single-character tokens are mostly particles and the like, so skip them
        continue
    # Create the key if it is new; otherwise add one to its existing count
    counts[word] = counts.get(word, 0) + 1
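# Remove each listed noise word from the counts if segmentation produced it
for word in excludes:
    counts.pop(word, None)  # pop with a default avoids KeyError when the word never occurred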
# Convert the word-to-count dictionary into a list of (word, count) pairs
items = list(counts.items())
# Sort the list by count; reverse=True gives descending order
items.sort(key=lambda x: x[1], reverse=True)
# Print the 20 most frequent words; slicing is safe even if fewer than 20 remain
for word, count in items[:20]:
    print("{0:<10}{1:>5}".format(word, count))
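The counting steps in the script above can also be sketched more compactly with `collections.Counter` from the standard library. This is only an illustrative variant, not the original author's method: the helper name `top_words` and the sample tokens are hypothetical, and the sketch takes an already-segmented list (what `jieba.lcut(txt)` would return) so that it runs without the novel's text file. It also filters the excluded words while counting instead of deleting them afterwards.

```python
from collections import Counter


def top_words(tokens, excludes=frozenset(), n=20):
    """Return the n most frequent multi-character tokens that are not in excludes."""
    counts = Counter(
        tok for tok in tokens
        if len(tok) > 1 and tok not in excludes   # skip single characters and noise words
    )
    return counts.most_common(n)                  # list of (word, count), highest count first


# Hypothetical pre-segmented sample standing in for jieba.lcut(txt)
sample_tokens = ["寶玉", "說道", "黛玉", "寶玉", "的", "黛玉", "寶玉", "什麼"]
sample_excludes = {"說道", "什麼"}

for word, count in top_words(sample_tokens, sample_excludes, n=2):
    print("{0:<10}{1:>5}".format(word, count))
```

Filtering during counting means a noise word is never stored at all, so there is no need to guard against deleting a key that does not exist.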