Python 獲得漢字筆畫

阿新 • • 發佈：2018-05-06

strip() 通過 star 原理如果實現包含 urn with

通過unihan的文件來實現。
只要是unihan中有kTotalStrokes字段，獲取起筆畫數。
Hash也是非常簡單清楚的，但想到這些unicode其實會有一個分布規律，就記錄了一下，
利用此性質通過數組方式來獲取筆畫。

記錄了一下unicode的範圍
start: [13311, 19968, 63744, 131072, 173824, 177984, 178208, 194995]
end : [19893, 40917, 64045, 173782, 177972, 178205, 183969, 194998]

總共包括80682個存在筆畫數的unicode碼，包含CJKV。
64045-131072中間都沒有此字段，就分了兩部分。

此處使用python3 Demo實現，原理非常簡單：使用數組保持筆畫，將unicode映射到數組index，即可獲取對應筆畫數

def get_stroke(c):
    # 如果返回 0, 則也是在unicode中不存在kTotalStrokes字段
    strokes = []
    with open(strokes_path, ‘r‘) as fr:
        for line in fr:
            strokes.append(int(line.strip()))

    unicode_ = ord(c)

    if 13312 <= unicode_ <= 64045:
        return strokes[unicode_-13312]
    elif 131072 <= unicode_ <= 194998:
        return strokes[unicode_-80338]
    else:
        print("c should be a CJK char, or not have stroke in unihan data.")
        # can also return 0

strokes_path: https://github.com/helmz/Corpus/blob/master/zh_dict/strokes.txt
"按照unicode順序排列的筆畫數"

Python 獲得漢字筆畫

strip() 通過 star 原理如果實現包含 urn with 通過unihan的文件來實現。只要是unihan中有kTotalStrokes字段，獲取起筆畫數。 Hash也是非常簡單清楚的，但想到這些unicode其實會有一個分布規律，就記錄了一下，利用此性

Python 獲得漢字筆畫

Python 獲得漢字筆畫

python 獲得列表中每個元素出現次數的最快方法

python獲得類名和函數名的方法

200.1 Python實現一筆畫完輔助

Python之漢字轉拼音

python獲得頁面跳轉的最終URL

獲得漢字的拼音或者拼音簡寫

Python 判斷漢字

關於在Python當中漢字日期的轉換

【Python】使用python實現漢字轉拼音（2018.12更新）

Python獲得一篇文件的不重複詞列表

Python 獲得13位unix時間戳

教你使用python獲得字串的md5值

MFC獲得漢字拼音首個字母-C++版

python中漢字匹配

python 將漢字轉換為拼音

取得漢字筆畫數(WORD 版) (VBA)

python獲得變數的名稱，獲得傳參（形參和實參）的名稱

Python 獲得命令列引數的方法

python list 漢字亂碼

Python 獲得漢字筆畫

相關推薦