Linux Tips 7: Converting GBK to UTF-8
阿新 • Published: 2018-12-21
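Converting a file's content encoding
Java source files edited on Windows often show garbled Chinese characters when opened on Linux. The cause is the file encoding: Windows typically uses GBK, whereas Linux uses UTF-8.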
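In vim, the set fileencoding command shows the encoding vim detected for the current file:
// file created on Linux
fileencoding=utf-8
// file created on Windows
fileencoding=latin1
Note that vim reports latin1 rather than GBK for the Windows file, presumably because GBK is not in its fileencodings list and latin1 is the fallback.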
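The same check can be approximated outside vim. The sketch below is only a heuristic: it asks iconv whether the file decodes cleanly as UTF-8, and if not, whether it decodes as GBK. The helper names `is_utf8` and `guess_encoding` are made up for this example, and a file that decodes as GBK is not guaranteed to actually be GBK.

```shell
# Hypothetical helper: succeeds if FILE decodes cleanly as UTF-8.
is_utf8() {
    iconv -f UTF-8 -t UTF-8 "$1" >/dev/null 2>&1
}

# Hypothetical helper: print a best-guess encoding label for FILE.
guess_encoding() {
    if is_utf8 "$1"; then
        echo "utf-8"       # plain ASCII files also land here
    elif iconv -f GBK -t UTF-8 "$1" >/dev/null 2>&1; then
        echo "gbk?"        # decodes as GBK, but other legacy encodings might too
    else
        echo "unknown"
    fi
}

# Example call (test.java is the file name used in this article):
# guess_encoding test.java
```

Because the UTF-8 test runs first, a file that is already UTF-8 is never reported as GBK, which helps avoid converting the same file twice.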
The simplest fix is to save the file as UTF-8 on Windows. On Linux, the iconv tool can convert the file instead.
$ iconv --help
Usage: iconv [OPTION...] [FILE...]
Convert encoding of given files from one encoding to another.

 Input/Output format specification:
  -f, --from-code=NAME       encoding of original text
  -t, --to-code=NAME         encoding for output

 Information:
  -l, --list                 list all known coded character sets

 Output control:
  -c                         omit invalid characters from output
  -o, --output=FILE          output file
  -s, --silent               suppress warnings
      --verbose              print progress information

  -?, --help                 Give this help list
      --usage                Give a short usage message
  -V, --version              Print program version
$ iconv -f GBK -t UTF-8 test.java -o test2.java
Once the conversion finishes, the garbled Chinese characters are gone.
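The command above writes to a second file. If the goal is to overwrite the original, one possible approach is to convert into a temporary file and copy it back only when iconv succeeds. This is a sketch, not part of iconv itself; the function name `gbk_to_utf8_inplace` is an assumption made for this example.

```shell
# Hypothetical helper: convert FILE from GBK to UTF-8 in place.
gbk_to_utf8_inplace() {
    file=$1
    tmp=$(mktemp "$file.XXXXXX") || return 1
    if iconv -f GBK -t UTF-8 "$file" > "$tmp"; then
        # Copy the contents back so the original keeps its permissions.
        cat "$tmp" > "$file"
        rm -f "$tmp"
    else
        # Leave the original untouched if the conversion failed.
        rm -f "$tmp"
        return 1
    fi
}

# Example call:
# gbk_to_utf8_inplace test.java
```

Running this on a file that is already UTF-8 may corrupt it, because many UTF-8 byte sequences also decode as (wrong) GBK characters, so it is worth checking the encoding first.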
Batch conversion of files
1. Recreate the directory structure
$ find com -type d -exec mkdir -p com2/{} \;
2. Convert
$ find com -type f -exec iconv -f GBK -t UTF-8 {} -o com2/{} \;
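The two steps can also be folded into one pass. The following sketch assumes a relative source directory and writes the converted copies under the target directory with the same layout as the commands above (for example com2/com/...). The function name `convert_tree` is hypothetical, and files that fail to convert are only reported, not retried.

```shell
# Hypothetical helper: mirror SRC into DST, converting each file from GBK to UTF-8.
# SRC is assumed to be a relative path, as in the article's "com" example.
convert_tree() {
    src=$1
    dst=$2
    find "$src" -type f -exec sh -c '
        dst=$1; shift
        for f in "$@"; do
            out="$dst/$f"
            # Create the parent directory on demand instead of in a separate step.
            mkdir -p "$(dirname "$out")"
            iconv -f GBK -t UTF-8 "$f" > "$out" || echo "failed: $f" >&2
        done
    ' sh "$dst" {} +
}

# Example call:
# convert_tree com com2
```

Passing the file names as arguments to the inner sh, rather than embedding {} inside a longer string, keeps names with spaces intact and does not rely on find substituting {} within an argument.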
Converting file and directory names
This calls for the convmv tool.
$ convmv
Your Perl version has fleas #22111 #37757 #49830
convmv 1.15 - converts filenames from one encoding to another
Copyright (C) 2003-2011 Bjoern JACKE <[email protected]>

 USAGE: convmv [options] FILE(S)
-f enc     encoding *from* which should be converted
-t enc     encoding *to* which should be converted
-r         recursively go through directories
-i         interactive mode (ask for each action)
--nfc      target files will be normalization form C for UTF-8 (Linux etc.)
--nfd      target files will be normalization form D for UTF-8 (OS X etc.)
--qfrom    be quiet about the "from" of a rename (if it screws up your terminal e.g.)
--qto      be quiet about the "to" of a rename (if it screws up your terminal e.g.)
--exec c   execute command instead of rename (use #1 and #2 and see man page)
--list     list all available encodings
--lowmem   keep memory footprint low (see man page)
--nosmart  ignore if files already seem to be UTF-8 and convert if posible
--notest   actually do rename the files
--replace  will replace files if they are equal
--unescape convert%20ugly%20escape%20sequences
--upper    turn to upper case
--lower    turn to lower case
--parsable write a parsable todo list (see man page)
--help     print this help
To recursively convert the names of the files and directories under tech:
$ sudo convmv -f gbk -t utf-8 -r --notest tech/
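Without --notest, convmv only prints the renames it would perform, so running the command once without that flag is a cheap way to preview the result. Where convmv is not installed, a rough substitute for a single directory can be built from iconv and mv. The sketch below is not how convmv works internally; the function name `rename_gbk_names` is hypothetical, it does not recurse, and it ignores hidden files.

```shell
# Hypothetical helper: rename the entries directly inside DIR from GBK to UTF-8 names.
rename_gbk_names() {
    dir=$1
    for path in "$dir"/*; do
        [ -e "$path" ] || continue      # nothing matched (empty directory)
        old=$(basename "$path")
        # Skip names that are already valid UTF-8 (this includes plain ASCII).
        if printf '%s' "$old" | iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1; then
            continue
        fi
        # Skip names that cannot be decoded as GBK.
        new=$(printf '%s' "$old" | iconv -f GBK -t UTF-8 2>/dev/null) || continue
        # Never overwrite an existing entry.
        if [ -e "$dir/$new" ]; then
            continue
        fi
        mv -- "$path" "$dir/$new"
    done
}

# Example call:
# rename_gbk_names tech
```

For whole directory trees convmv remains the better choice, since renaming a directory changes the paths of everything beneath it and convmv handles that ordering itself.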
Also note that zip archives created on Windows can sometimes cause the same kind of garbled names.