【miscellaneous】【C/C++】Converting between the UTF-8 and GBK character encodings

阿新 • Published: 2019-01-11

#include <windows.h>  // WideCharToMultiByte, MultiByteToWideChar
#include <cstring>    // memset
#include <string>

using std::string;

class CChineseCode
{
public:
    static void UTF_8ToUnicode(wchar_t* pOut, char* pText);          // UTF-8 -> Unicode
    static void UnicodeToUTF_8(char* pOut, wchar_t* pText);          // Unicode -> UTF-8
    static void UnicodeToGB2312(char* pOut, wchar_t uData);          // Unicode -> GB2312
    static void Gb2312ToUnicode(wchar_t* pOut, char* gbBuffer);      // GB2312 -> Unicode
    static void GB2312ToUTF_8(string& pOut, char* pText, int pLen);  // GB2312 -> UTF-8
    static void UTF_8ToGB2312(string& pOut, char* pText, int pLen);  // UTF-8 -> GB2312
};
 20 
// Class implementation
 22 
// Decodes one 3-byte UTF-8 sequence (U+0800..U+FFFF) into a single wchar_t.
// Assumes a 2-byte, little-endian wchar_t, as on Windows.
void CChineseCode::UTF_8ToUnicode(wchar_t* pOut, char* pText)
{
    char* uchar = (char*)pOut;
    uchar[1] = ((pText[0] & 0x0F) << 4) + ((pText[1] >> 2) & 0x0F);  // high byte
    uchar[0] = ((pText[1] & 0x03) << 6) + (pText[2] & 0x3F);         // low byte
}
 36 
// Encodes one wchar_t as a 3-byte UTF-8 sequence.
// Only correct for characters in U+0800..U+FFFF; pOut needs room for 3 bytes.
void CChineseCode::UnicodeToUTF_8(char* pOut, wchar_t* pText)
{
    // Note the byte order of WCHAR: low byte first, high byte second.
    char* pchar = (char*)pText;
    pOut[0] = (0xE0 | ((pchar[1] & 0xF0) >> 4));
    pOut[1] = (0x80 | ((pchar[1] & 0x0F) << 2)) + ((pchar[0] & 0xC0) >> 6);
    pOut[2] = (0x80 | (pchar[0] & 0x3F));
}
 54 
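// Converts one wide character to the system ANSI code page (CP_ACP),
// which is GBK (code page 936) on Simplified Chinese Windows.
// pOut needs room for at least 2 bytes.
void CChineseCode::UnicodeToGB2312(char* pOut, wchar_t uData)
{
    WideCharToMultiByte(CP_ACP, 0, &uData, 1, pOut, sizeof(wchar_t), NULL, NULL);
}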
 64 
// Converts one 2-byte GB2312 character to a single wide character,
// again using the system ANSI code page (CP_ACP).
void CChineseCode::Gb2312ToUnicode(wchar_t* pOut, char* gbBuffer)
{
    ::MultiByteToWideChar(CP_ACP, MB_PRECOMPOSED, gbBuffer, 2, pOut, 1);
}
 74 
void CChineseCode::GB2312ToUTF_8(string& pOut, char* pText, int pLen)
{
    char buf[4];
    int nLength = pLen * 3 + 1;  // generous upper bound, plus room for the terminator
    char* rst = new char[nLength];
    memset(buf, 0, 4);
    memset(rst, 0, nLength);

    int i = 0;
    int j = 0;
    while (i < pLen)
    {
        // ASCII bytes can be copied as-is (relies on char being signed, as with MSVC).
        if (*(pText + i) >= 0)
        {
            rst[j++] = pText[i++];
        }
        else
        {
            // Non-ASCII: treat the next two bytes as one GB2312 character.
            wchar_t pbuffer;
            Gb2312ToUnicode(&pbuffer, pText + i);
            UnicodeToUTF_8(buf, &pbuffer);
            rst[j]     = buf[0];
            rst[j + 1] = buf[1];
            rst[j + 2] = buf[2];
            j += 3;
            i += 2;
        }
    }
    rst[j] = '\0';

    // Return the result.
    pOut = rst;
    delete[] rst;
}
144 
void CChineseCode::UTF_8ToGB2312(string& pOut, char* pText, int pLen)
{
    // For well-formed input the output is never longer than the input;
    // the extra byte holds the terminator.
    char* newBuf = new char[pLen + 1];
    char Ctemp[4];
    memset(Ctemp, 0, 4);

    int i = 0;
    int j = 0;
    while (i < pLen)
    {
        // ASCII bytes can be copied as-is (relies on char being signed, as with MSVC).
        if (pText[i] >= 0)
        {
            newBuf[j++] = pText[i++];
        }
        else
        {
            // Non-ASCII: assumes a 3-byte UTF-8 sequence that maps to a 2-byte GB2312 character.
            WCHAR Wtemp;
            UTF_8ToUnicode(&Wtemp, pText + i);
            UnicodeToGB2312(Ctemp, Wtemp);
            newBuf[j]     = Ctemp[0];
            newBuf[j + 1] = Ctemp[1];
            i += 3;
            j += 2;
        }
    }
    newBuf[j] = '\0';

    pOut = newBuf;
    delete[] newBuf;
}
