Converting between emoji and Unicode code points (JS, Java, C#)
阿新 • Published: 2021-01-06
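A few days ago I needed to turn the Unicode code point of an emoji into the character itself, for example 1f601 into the grinning face 😁. I could not find a C# method that converts 1f601 into text, and nothing I tried with Encoding.Unicode gave the right result. (Encoding.Unicode is UTF-16, where U+1F601 is stored as the surrogate pair D83D DE01, so the bytes of 1f601 cannot be decoded directly.) In the end I pasted the emoji character straight into the source, Visual Studio displayed it without trouble, so I simply used the character with no conversion at all and left it at that.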
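The surrogate pair arithmetic behind that failure can be sketched in a few lines of JavaScript. This is only an illustration of the UTF-16 rule for code points above U+FFFF; the helper name toSurrogatePair is hypothetical and not part of any library.

```javascript
// Hypothetical helper: split a code point above U+FFFF into its
// UTF-16 surrogate pair (high unit first).
function toSurrogatePair(codePoint) {
  if (codePoint < 0x10000 || codePoint > 0x10ffff) {
    throw new RangeError("expected a code point in U+10000..U+10FFFF");
  }
  const offset = codePoint - 0x10000;     // 20-bit value
  const high = 0xd800 + (offset >> 10);   // top 10 bits
  const low = 0xdc00 + (offset & 0x3ff);  // bottom 10 bits
  return [high, low];
}

const pair = toSurrogatePair(0x1f601).map((unit) => unit.toString(16));
console.log(pair);                                // [ 'd83d', 'de01' ]
console.log(String.fromCharCode(0xd83d, 0xde01)); // 😁
console.log("😁".length);                         // 2 (two UTF-16 code units)
```

Because the string holds two 16-bit units, any API that works on UTF-16 units (such as Encoding.Unicode in C#) sees d83d and de01, never 1f601.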
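Today, while working on a Markdown editor, the earlier GFM work made me test the encodings again. I did not find any reliable write-up, but I did find plenty of emoji-to-Unicode tables, such as https://apps.timwhitlock.info/emoji/tables/unicode. Let's take one smiley, https://apps.timwhitlock.info/unicode/inspect/hex/1F601, as the test subject.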
1. Emoji character to code
【C#】
Encoding.UTF32.GetBytes("😁") -> ["1","f6","1","0"] (hex bytes 01 f6 01 00, little-endian; read in reverse they give 0x0001f601)
【js】
"😁".codePointAt(0).toString(16) -> 1f601
【java】
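// Java's "utf-32" encoder writes big-endian bytes without a BOM.
// getBytes(String) throws the checked UnsupportedEncodingException.
byte[] bytes = "😁".getBytes("utf-32");
System.out.println(getBytesCode(bytes)); // \x0\x1\xf6\x1

// Format each byte as \x followed by its hex value.
private static String getBytesCode(byte[] bytes) {
    StringBuilder code = new StringBuilder();
    for (byte b : bytes) {
        // b & 0xff maps the signed byte to 0..255 before hex formatting
        code.append("\\x").append(Integer.toHexString(b & 0xff));
    }
    return code.toString();
}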
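The UTF-32 results agree on the value 0x0001f601. Only the byte order differs: C#'s Encoding.UTF32 is little-endian, while Java's "utf-32" encoder is big-endian.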
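To compare the two byte orders side by side, here is a small JavaScript sketch that writes the code point into a 4-byte buffer both ways. The helper name utf32Hex is hypothetical, and it only looks at the first code point of the string.

```javascript
// Hypothetical helper: UTF-32 bytes of the first code point of `text`,
// returned as two-digit hex strings.
function utf32Hex(text, littleEndian) {
  const view = new DataView(new ArrayBuffer(4));
  view.setUint32(0, text.codePointAt(0), littleEndian);
  const bytes = new Uint8Array(view.buffer);
  return Array.from(bytes, (b) => b.toString(16).padStart(2, "0"));
}

console.log(utf32Hex("😁", true));  // [ '01', 'f6', '01', '00' ]  same order as the C# output
console.log(utf32Hex("😁", false)); // [ '00', '01', 'f6', '01' ]  same order as the Java output
```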
【C#】
Encoding.UTF8.GetBytes("😁") -> ["f0","9f","98","81"]
【js】
encodeURIComponent("😁") -> %F0%9F%98%81
The UTF-8 results agree.
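The same check can be done in JavaScript without going through percent-encoding. The sketch below assumes a runtime that provides the standard TextEncoder global (modern browsers and current Node.js); utf8Hex is a hypothetical helper name.

```javascript
// Hypothetical helper: UTF-8 bytes of `text` as two-digit hex strings.
function utf8Hex(text) {
  const bytes = new TextEncoder().encode(text); // TextEncoder always produces UTF-8
  return Array.from(bytes, (b) => b.toString(16).padStart(2, "0"));
}

const direct = utf8Hex("😁");
// Strip the % signs from encodeURIComponent's output for comparison.
const viaPercent = encodeURIComponent("😁").toLowerCase().split("%").slice(1);

console.log(direct);     // [ 'f0', '9f', '98', '81' ]
console.log(viaPercent); // [ 'f0', '9f', '98', '81' ]
```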
2. Code to emoji character
【js】
String.fromCodePoint(0x1f601) -> 😁 (the argument is the code point, which is the same number as the UTF-32 value)
【java】
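String emojiName = "1f601"; // hex code point; UTF-32 uses 4 bytes per code point
int emojiCode = Integer.parseInt(emojiName, 16);
byte[] emojiBytes = int2bytes(emojiCode);
// Without a BOM, Java's "utf-32" decoder assumes big-endian input.
String emojiChar = new String(emojiBytes, "utf-32");
System.out.println(emojiChar); // 😁
// Equivalent without the byte array: new String(Character.toChars(emojiCode))

// Split an int into 4 bytes, most significant byte first (big-endian).
public static byte[] int2bytes(int num) {
    byte[] result = new byte[4];
    result[0] = (byte) ((num >>> 24) & 0xff); // highest byte
    result[1] = (byte) ((num >>> 16) & 0xff);
    result[2] = (byte) ((num >>> 8) & 0xff);
    result[3] = (byte) (num & 0xff);          // lowest byte
    return result;
}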
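When the code arrives as a hex string such as "1f601", as in the Java example, a JavaScript version could look like the sketch below. The helper name hexToEmoji is hypothetical, and accepting several hyphen-separated code points (for example "1f1e8-1f1f3" for a flag) is an assumed input convention, not something the text above specifies.

```javascript
// Hypothetical helper: turn a hex code such as "1f601" into text.
// Assumption: multi-code-point emoji are written with hyphens, e.g. "1f1e8-1f1f3".
function hexToEmoji(hex) {
  const codePoints = hex.split("-").map((part) => {
    if (!/^[0-9a-f]{1,6}$/i.test(part)) {
      throw new SyntaxError("not a hex code point: " + part);
    }
    return parseInt(part, 16);
  });
  // String.fromCodePoint throws a RangeError for values above 0x10ffff.
  return String.fromCodePoint(...codePoints);
}

console.log(hexToEmoji("1f601")); // 😁
console.log(hexToEmoji("1F601") === "😁"); // true (case-insensitive)
```

Validating the input first avoids parseInt's habit of silently ignoring trailing garbage, for example parseInt("1f601zz", 16) still yields 0x1f601.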
C# example: converting between Chinese characters and Unicode escapes
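// Requires: using System; using System.Text; using System.Text.RegularExpressions;

/// <summary>
/// Convert a string to Unicode escapes
/// </summary>
/// <param name="source">Source string</param>
/// <returns>The string encoded as \uXXXX escapes</returns>
public static string String2Unicode(string source)
{
    // Encoding.Unicode is UTF-16LE: two bytes per code unit, low byte first,
    // so an emoji such as 😁 produces two escapes (its surrogate pair).
    byte[] bytes = Encoding.Unicode.GetBytes(source);
    StringBuilder stringBuilder = new StringBuilder();
    for (int i = 0; i < bytes.Length; i += 2)
    {
        // Swap the two bytes so the high byte is printed first.
        stringBuilder.AppendFormat("\\u{0}{1}",
            bytes[i + 1].ToString("x").PadLeft(2, '0'),
            bytes[i].ToString("x").PadLeft(2, '0'));
    }
    return stringBuilder.ToString();
}

/// <summary>
/// Convert Unicode escapes back to a string
/// </summary>
/// <param name="source">String encoded as \uXXXX escapes</param>
/// <returns>The decoded string</returns>
public static string Unicode2String(string source)
{
    // Replace every \uXXXX escape with the UTF-16 code unit it names.
    return new Regex(@"\\u([0-9A-F]{4})", RegexOptions.IgnoreCase | RegexOptions.Compiled).Replace(
        source, x => string.Empty + Convert.ToChar(Convert.ToUInt16(x.Result("$1"), 16)));
}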
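For comparison, the same round trip can be sketched in JavaScript, which also stores strings as UTF-16 code units. The function names toUnicodeEscapes and fromUnicodeEscapes are hypothetical, chosen to mirror the C# methods above.

```javascript
// Hypothetical counterpart of String2Unicode: one \uXXXX escape per UTF-16 code unit.
function toUnicodeEscapes(source) {
  let out = "";
  for (let i = 0; i < source.length; i++) {
    out += "\\u" + source.charCodeAt(i).toString(16).padStart(4, "0");
  }
  return out;
}

// Hypothetical counterpart of Unicode2String: decode every \uXXXX escape.
function fromUnicodeEscapes(source) {
  return source.replace(/\\u([0-9a-f]{4})/gi, (match, hex) =>
    String.fromCharCode(parseInt(hex, 16))
  );
}

const escaped = toUnicodeEscapes("中😁");
console.log(escaped);                     // \u4e2d\ud83d\ude01
console.log(fromUnicodeEscapes(escaped)); // 中😁
```

The emoji comes out as two escapes because it occupies a surrogate pair; decoding the escapes in order puts the pair back together.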
References:
https://www.jianshu.com/p/8a416537deb3
https://blog.csdn.net/a19881029/article/details/13511729
https://apps.timwhitlock.info/emoji/tables/unicode