AlbertS Home of Technology

阿新 • • Published: 2019-01-08

Preface

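Over the past few days, while working on a multilingual release, I was reminded once again that telling languages apart is genuinely difficult. The last time I built a Chinese-text extraction tool it took quite a while, so this time I decided to try Python. It turned out to be far more convenient, and I have written the result up here for future reference.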

Code

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
# Find the lines containing Chinese characters in a file

__author__ = 'AlbertS'

import re

def start_find_chinese():
    find_count = 0
    with open('ko_untranslated.txt', 'wb') as outfile:
        with open('source_ko.txt', 'rb') as infile:
            # Iterate line by line; the loop ends at end of file
            for content in infile:
                # search() finds a Chinese character anywhere in the line
                if re.search(r'[\u4E00-\u9FA5]', content.decode('utf-8')):
                    # Write the original bytes so the line is copied unchanged
                    outfile.write(content)
                    find_count += 1
    return find_count

# start to find
if __name__ == '__main__':
    count = start_find_chinese()
    print("find complete! count =", count)

Source file

Contents of source_ko.txt

3   캐릭터 Lv.50 달성
8   캐릭터 Lv.80 달성
10  캐릭터 Lv.90 달성
...
...
2840    飛黃騰達
4841    同歸於盡
8848    캐릭터 Lv.50 달

Result (ko_untranslated.txt)

2840    飛黃騰達
4841    同歸於盡
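
The filtering step can also be checked in memory, without any files. The snippet below is a minimal sketch that applies the same character range as the script, U+4E00 to U+9FA5, to a few of the sample lines. The names `CHINESE_CHAR`, `has_chinese`, and `samples` are illustrative and are not part of the original script.

```python
import re

# Same character range as the script: U+4E00 to U+9FA5
CHINESE_CHAR = re.compile(r'[\u4E00-\u9FA5]')

def has_chinese(line):
    """Return True if the line contains at least one character in the range."""
    return CHINESE_CHAR.search(line) is not None

# A few lines taken from the sample source_ko.txt
samples = [
    '3   캐릭터 Lv.50 달성',
    '2840    飛黃騰達',
    '4841    同歸於盡',
]

# Keep only the lines that still contain Chinese text
matched = [s for s in samples if has_chinese(s)]
print(matched)
```

Hangul syllables start at U+AC00, outside this range, so the Korean lines are skipped and only the two Chinese lines remain.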

Summary

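This small piece of code actually covers two commonly used features: reading and writing files, and regular expressions.
Both are important topics. The with statement closes the files automatically, even if an error occurs, which prevents resource leaks and makes the code easier to write.
Regular expressions are a powerful tool for text processing. The pattern in this code is probably not complete yet, since it only covers U+4E00 to U+9FA5, and I will keep updating it.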
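
As one possible direction for a more complete pattern, the sketch below widens the character class and opens both files in a single with statement. The extra Unicode blocks are an assumption about what "more complete" could mean, not the author's update. `HAN_CHAR`, `extract_han_lines`, and the temporary file names are hypothetical. Note that these ranges also match kanji in Japanese text, so the pattern detects Han characters rather than the Chinese language specifically.

```python
import os
import re
import tempfile

# Assumed wider coverage than the original [\u4E00-\u9FA5]:
#   U+3400-U+4DBF  CJK Unified Ideographs Extension A
#   U+4E00-U+9FFF  CJK Unified Ideographs
#   U+F900-U+FAFF  CJK Compatibility Ideographs
HAN_CHAR = re.compile(r'[\u3400-\u4DBF\u4E00-\u9FFF\uF900-\uFAFF]')

def extract_han_lines(src_path, dst_path):
    """Copy every line containing a Han character from src_path to dst_path."""
    count = 0
    # One with statement manages both files; each is closed on exit,
    # even if an exception is raised inside the block.
    # newline='' keeps the original line endings untouched.
    with open(src_path, encoding='utf-8', newline='') as infile, \
         open(dst_path, 'w', encoding='utf-8', newline='') as outfile:
        for line in infile:
            if HAN_CHAR.search(line):
                outfile.write(line)
                count += 1
    return count

# Demo on temporary files so that no real data is touched
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, 'source.txt')
    dst = os.path.join(tmp, 'untranslated.txt')
    with open(src, 'w', encoding='utf-8', newline='') as f:
        f.write('3   캐릭터 Lv.50 달성\n2840    飛黃騰達\n')
    count = extract_han_lines(src, dst)
    with open(dst, encoding='utf-8', newline='') as f:
        result = f.read()

print(count, repr(result))
```

Opening the files in text mode with an explicit encoding avoids the manual `decode` call, at the cost of raising an error on input that is not valid UTF-8.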
