Handling garbled text when reading HTML file content
阿新 • Published: 2019-02-05
1. Garbled text: read all of the bytes first, then convert them into the string you need.
Correct approach:
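import java.io.ByteArrayOutputStream;
import java.io.InputStream;

// Collect every byte of the response before decoding anything.
ByteArrayOutputStream outHtml = new ByteArrayOutputStream();
InputStream inn = conn.getInputStream();
byte[] buffer = new byte[1024];
int len;
while ((len = inn.read(buffer)) != -1) {
    outHtml.write(buffer, 0, len);
}
// Decode once, so no multi-byte character is cut at a buffer boundary.
byte[] data = outHtml.toByteArray();
logger.info("Before conversion (UTF-8): " + new String(data, "utf-8"));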
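The same read-everything-then-decode idea can be wrapped in a small helper. This is only a sketch: the class name `HtmlBytes` and the method `readAll` are hypothetical, and a `ByteArrayInputStream` stands in for `conn.getInputStream()` so the example runs without a network connection.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class HtmlBytes {
    // Hypothetical helper: collects every byte first, then decodes exactly once.
    static String readAll(InputStream in, Charset charset) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[1024];
        int len;
        while ((len = in.read(buffer)) != -1) {
            out.write(buffer, 0, len);
        }
        return new String(out.toByteArray(), charset);
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the HTTP response body. The escapes spell two Chinese
        // characters, written this way so the source file encoding is irrelevant.
        byte[] page = "<p>\u4E82\u78BC</p>".getBytes(StandardCharsets.UTF_8);
        try (InputStream in = new ByteArrayInputStream(page)) {
            System.out.println(readAll(in, StandardCharsets.UTF_8));
        }
    }
}
```

Passing a `Charset` constant instead of the string `"utf-8"` avoids the checked `UnsupportedEncodingException` that the string-based `String` constructor declares.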
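Wrong approach: what causes the garbled text? Why does it decode cleanly in the local environment but come out garbled in the runtime environment? Is it simply that a read may stop partway through a character's bytes, so converting that chunk to a String decodes it incorrectly?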
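InputStream inn = conn.getInputStream();
InputStream inputStream = new BufferedInputStream(inn);
StringBuffer htmlContent = new StringBuffer();
byte[] b = new byte[1024];
for (int n; (n = inputStream.read(b)) != -1;) {
    // Problem: each chunk is decoded on its own, so a multi-byte character
    // that straddles two reads is decoded as two broken halves.
    htmlContent.append(new String(b, 0, n, "utf-8"));
}
logger.info("When fetched: " + htmlContent.toString());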
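The suspicion in the question above looks right: `read` may return fewer bytes than the buffer holds, and where each chunk ends depends on how the data arrives, which would explain why the result differs between environments. The sketch below reproduces the effect without a network by cutting a UTF-8 byte array at a fixed chunk size. The class `SplitCharDemo` and its methods are hypothetical names used only for illustration.

```java
import java.nio.charset.StandardCharsets;

class SplitCharDemo {
    // Mimics the failing loop: every chunk is decoded independently.
    static String decodePerChunk(byte[] data, int chunkSize) {
        StringBuilder sb = new StringBuilder();
        for (int off = 0; off < data.length; off += chunkSize) {
            int n = Math.min(chunkSize, data.length - off);
            sb.append(new String(data, off, n, StandardCharsets.UTF_8));
        }
        return sb.toString();
    }

    // Mimics the working approach: decode the complete byte array once.
    static String decodeWhole(byte[] data) {
        return new String(data, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Two Chinese characters, three UTF-8 bytes each (six bytes in total).
        byte[] data = "\u4E82\u78BC".getBytes(StandardCharsets.UTF_8);

        // Chunks of 3 happen to end on character boundaries, so this decodes cleanly.
        System.out.println(decodePerChunk(data, 3).equals(decodeWhole(data)));

        // Chunks of 4 cut the second character after its first byte, so the
        // result contains U+FFFD replacement characters.
        System.out.println(decodePerChunk(data, 4).indexOf('\uFFFD') >= 0);
    }
}
```

Pages that are pure ASCII, or chunks that happen to end on a character boundary, never trigger the problem, so it can stay hidden during local testing. Besides buffering all bytes first, wrapping the stream in an `InputStreamReader` constructed with the charset is another option, since the reader carries an incomplete byte sequence over to the next read.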