How IntelliJ IDEA Identifies a File's Encoding

Axin • Published: 2018-12-25

Today we would like to answer the most frequent questions about file encodings in the IDE and show you a few tricks that may help you avoid potential pitfalls.

What is the problem with file encodings?

To display text correctly, IntelliJ IDEA needs to know which file encoding to use. Unfortunately, it is not always possible to tell the file encoding without additional information. This is especially true of single-byte encodings, where the same bytes map to different characters under different encodings.
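A minimal Python sketch of that ambiguity follows. The byte value and the two encodings are chosen purely for illustration: the same byte decodes to a different character under each mapping, and nothing in the file itself says which one was intended.

```python
# One byte on disk, two equally valid interpretations.
raw = b"\xe4"

as_latin1 = raw.decode("iso-8859-1")    # Western European mapping
as_cp1251 = raw.decode("windows-1251")  # Cyrillic mapping

print(as_latin1)  # ä
print(as_cp1251)  # д
```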

However, things look better for UTF-family encodings. The UTF family consists of:

Several multi-byte encodings like UTF-16 or UTF-32, which are easily detectable by the BOM (Byte Order Mark) at the beginning of the file.
The variable-length UTF-8 encoding, which can also be auto-detected, either by an optional BOM or by some specific byte combinations.
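BOM-based detection can be sketched in a few lines of Python. This is only an illustration, not IntelliJ IDEA's actual code, and the helper name detect_bom is hypothetical. It relies on the BOM constants from the standard codecs module.

```python
import codecs

# Check the longest marks first: the UTF-32 LE BOM (FF FE 00 00)
# begins with the UTF-16 LE BOM (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def detect_bom(data: bytes):
    """Return the encoding implied by a leading BOM, or None if there is none."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    return None


print(detect_bom(codecs.BOM_UTF8 + b"hello"))  # utf-8
print(detect_bom(b"hello"))                    # None
```

One known limitation of this simple check: a UTF-16 LE file whose first character is NUL would be reported as UTF-32 LE.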

In particular, for the English character subset, a UTF-8 encoded file looks exactly like plain old ASCII text. That’s why UTF-8 is so popular and that’s why it’s the preferred encoding.
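The ASCII compatibility is easy to check in Python. The sample strings below are arbitrary: English-only text produces identical bytes in both encodings, while a non-ASCII character needs more than one byte in UTF-8.

```python
english = "Hello, world"

ascii_bytes = english.encode("ascii")
utf8_bytes = english.encode("utf-8")
print(ascii_bytes == utf8_bytes)  # True

# A character outside ASCII takes more than one byte in UTF-8.
umlaut_bytes = "ü".encode("utf-8")
print(len(umlaut_bytes))  # 2
```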

How does the IDE determine encoding for the file?

IntelliJ IDEA uses multi-stage educated guessing, from the most obvious to the far-fetched.

First, if a BOM is present, use the corresponding UTF-family encoding.

Check whether the file type declares the encoding itself and use that. For example, JSP files can specify the encoding right in the text, through the pageEncoding attribute of the page directive.

Check whether you have specified the encoding explicitly and use that. You can specify the desired encoding for the file, for its containing directory, for the whole project, or for the IDE. IntelliJ IDEA will use the most specific of these.

Try to figure out the encoding using some hints or heuristics. For example, when Auto-detect UTF-8 is selected, the IDE will analyze the file looking for some byte combinations which are UTF-8-specific.

Finally, use the project-level or, if the project is unavailable, the application-level encoding.
See Settings → File Encoding → Project Encoding → IDE Encoding.
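The stages above can be sketched as a chain of fallbacks. This Python sketch only mirrors the order described in this article and is not IntelliJ IDEA's real implementation. The EXPLICIT table, PROJECT_ENCODING, the paths, and all function names are hypothetical stand-ins for IDE settings, and the UTF-8 heuristic is simply "contains non-ASCII bytes and decodes cleanly".

```python
import codecs
import re
from pathlib import PurePosixPath

# Hypothetical settings: per-file and per-directory overrides plus a project default.
EXPLICIT = {
    "/proj/legacy": "windows-1251",
    "/proj/legacy/readme.txt": "iso-8859-1",
}
PROJECT_ENCODING = "windows-1252"

BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"), (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"), (codecs.BOM_UTF16_BE, "utf-16-be"),
]
PAGE_ENCODING = re.compile(rb'pageEncoding\s*=\s*"([A-Za-z0-9._:-]+)"')


def from_bom(data):
    """Stage 1: a byte order mark settles the question."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    return None


def from_declaration(path, data):
    """Stage 2: the file declares its own encoding (only JSP is handled here)."""
    if path.endswith(".jsp"):
        match = PAGE_ENCODING.search(data)
        if match:
            return match.group(1).decode("ascii").lower()
    return None


def from_explicit(path):
    """Stage 3: walk from the file up through its directories; most specific wins."""
    p = PurePosixPath(path)
    for candidate in [p, *p.parents]:
        encoding = EXPLICIT.get(str(candidate))
        if encoding:
            return encoding
    return None


def looks_like_utf8(data):
    """Stage 4: non-ASCII bytes that form valid UTF-8 sequences."""
    if data.isascii():
        return False
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def guess_encoding(path, data):
    """Try each stage in order and fall back to the project default (stage 5)."""
    return (
        from_bom(data)
        or from_declaration(path, data)
        or from_explicit(path)
        or ("utf-8" if looks_like_utf8(data) else None)
        or PROJECT_ENCODING
    )


print(guess_encoding("/proj/legacy/notes.txt", b"abc"))       # windows-1251
print(guess_encoding("/proj/src/a.txt", "ü".encode("utf-8")))  # utf-8
```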

What happens when I try to change the file encoding?

If the file encodings are completely compatible for this text, e.g. when changing English-only text from US-ASCII to UTF-8, IntelliJ IDEA just silently reassigns the encoding.
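One way to think about "completely compatible" is that the text has the same byte representation in both encodings, so nothing on disk needs to change. The Python sketch below assumes that definition. The function name is hypothetical and this is not necessarily the check the IDE performs.

```python
def can_reassign_silently(text: str, old: str, new: str) -> bool:
    """True when `text` encodes to identical bytes in both encodings."""
    try:
        return text.encode(old) == text.encode(new)
    except UnicodeEncodeError:
        # The text cannot even be represented in one of the encodings.
        return False


print(can_reassign_silently("plain English", "us-ascii", "utf-8"))  # True
print(can_reassign_silently("Grüße", "iso-8859-1", "utf-8"))        # False
```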

However, if the encodings are sufficiently different, the IDE has to ask you:

Whether you want to reload the file from disk in the other encoding.
In this case IDEA will replace the editor contents with the text from the file, decoded with the new encoding.
Or whether you would like to convert the text in the editor to the other encoding.
Here, IDEA will encode the text in the editor window using the new encoding and overwrite the file.
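The difference between the two choices can be illustrated in Python. This is a sketch with hypothetical function names, and replacing unrepresentable data via errors="replace" is an assumption made for the example. Reload keeps the bytes and changes the characters you see, while convert keeps the characters and changes the bytes.

```python
def reload(file_bytes: bytes, new_encoding: str) -> str:
    """Reload: the bytes on disk stay the same; the editor text is re-decoded."""
    return file_bytes.decode(new_encoding, errors="replace")


def convert(editor_text: str, new_encoding: str) -> bytes:
    """Convert: the editor text stays the same; the bytes on disk are rewritten."""
    return editor_text.encode(new_encoding, errors="replace")


on_disk = "Grüße".encode("iso-8859-1")    # a file saved as ISO-8859-1
in_editor = on_disk.decode("iso-8859-1")  # what the editor shows

reloaded = reload(on_disk, "windows-1251")  # same bytes, different characters
converted = convert(in_editor, "utf-8")     # same characters, different bytes

print(reloaded)
print(converted)
```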

Note the little gray exclamation marks shown next to an option: they mean that the particular conversion or reload can cause information loss.

For example, reloading a UTF-8 encoded file with the US-ASCII encoding loses the non-English characters in the process.

The same happens when you try to save German umlauts to a plain-text US-ASCII file.
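A rough Python equivalent of those warning marks is to attempt the operation strictly and see whether it fails. The two function names are hypothetical and the IDE's own checks may differ.

```python
def reload_is_lossy(file_bytes: bytes, encoding: str) -> bool:
    """True if the bytes on disk are not valid in `encoding`."""
    try:
        file_bytes.decode(encoding)
    except UnicodeDecodeError:
        return True
    return False


def convert_is_lossy(editor_text: str, encoding: str) -> bool:
    """True if some character in the editor cannot be written in `encoding`."""
    try:
        editor_text.encode(encoding)
    except UnicodeEncodeError:
        return True
    return False


utf8_file = "Grüße".encode("utf-8")
print(reload_is_lossy(utf8_file, "us-ascii"))  # True
print(convert_is_lossy("Grüße", "us-ascii"))   # True
print(convert_is_lossy("Hello", "us-ascii"))   # False
```

Note that a strict decode only catches invalid byte sequences. A single-byte encoding accepts almost any input, so a wrong guess there can go unnoticed by this check.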

What else can IntelliJ IDEA do for me?

IntelliJ IDEA will warn you when you try to swear in German in an ASCII-only document.

To enable this inspection, go to Settings → Inspections → Lossy Encoding.
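What such an inspection has to find can be sketched in Python: the characters in the text that the file's encoding cannot represent, along with their positions. The function name is hypothetical and this is not the inspection's actual implementation.

```python
def unencodable_chars(text: str, encoding: str):
    """Return (offset, character) pairs that `encoding` cannot represent."""
    problems = []
    for offset, ch in enumerate(text):
        try:
            ch.encode(encoding)
        except UnicodeEncodeError:
            problems.append((offset, ch))
    return problems


print(unencodable_chars("Grüße", "us-ascii"))  # [(2, 'ü'), (3, 'ß')]
```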

Likewise, IntelliJ IDEA will try to detect when you load text containing non-ASCII characters with an incompatible encoding.

What is the ultimate advice you have regarding file encodings?

To avoid any problems with file encodings, we strongly recommend using UTF-8.

That’s all for today and we hope this article was useful for you!

Develop with Pleasure!

Reposted from: http://blog.jetbrains.com/idea/2013/03/use-the-utf-8-luke-file-encodings-in-intellij-idea/
