A pitfall when replacing \r\n in Java
阿新 • Published: 2018-11-28
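In a project I needed to convert the \r\n stored in the database into <br />, which an HTML page can render, so I used text.replaceAll("(\\r\\n|\\r|\\n|\\n\\r)", "<br />"); for the replacement, only to find that it replaced nothing! The printed output was completely unchanged.
I then tried other approaches, such as replacing through Pattern directly, and still had no luck.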
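When a replacement silently does nothing, one quick check is to look at which characters the text actually contains. The sketch below prints the code point of every character; the class name `InspectChars`, the helper `describe`, and the sample strings are hypothetical and only serve as illustration. A real carriage return and line feed show up as 13 and 10, whereas the literal two-character text backslash + r shows up as 92 followed by 114.

```java
public class InspectChars {
    // Hypothetical helper: lists the code point of every character in the text.
    static String describe(String text) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append((int) text.charAt(i));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String realBreak = "a\r\nb";      // 4 chars: a, CR, LF, b
        String literalText = "a\\r\\nb";  // 6 chars: a, \, r, \, n, b
        System.out.println(describe(realBreak));   // 97 13 10 98
        System.out.println(describe(literalText)); // 97 92 114 92 110 98
    }
}
```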
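Later I happened to look at the source code of replaceAll, and everything suddenly became clear! The code is as follows (the solution is at the end!):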
public String replaceAll(String regex, String replacement) {
    return Pattern.compile(regex).matcher(this).replaceAll(replacement);
}

public String replaceAll(String replacement) {
    reset();
    boolean result = find();
    if (result) {
        StringBuffer sb = new StringBuffer();
        do {
            appendReplacement(sb, replacement);
            result = find();
        } while (result);
        appendTail(sb);
        return sb.toString();
    }
    return text.toString();
}
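The find / appendReplacement / appendTail loop above can be reproduced with the public Matcher API, which makes it easier to step through in a debugger. This is a minimal sketch, not the JDK implementation; the class name `ManualReplaceAll` and the sample input are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ManualReplaceAll {
    // Mirrors the loop in the source above using only public Matcher methods.
    static String replaceAll(String text, String regex, String replacement) {
        Matcher m = Pattern.compile(regex).matcher(text);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            // Copies the text before the match, then the processed replacement.
            m.appendReplacement(sb, replacement);
        }
        // Copies whatever follows the last match.
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        // The input holds a real LF and a real CRLF.
        String out = replaceAll("a\nb\r\nc", "\\r\\n|\\r|\\n", "<br />");
        System.out.println(out); // a<br />b<br />c
    }
}
```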
Here comes the key part!
public Matcher appendReplacement(StringBuffer sb, String replacement) {
    // If no match, return error
    if (first < 0)
        throw new IllegalStateException("No match available");

    // Process substitution string to replace group references with groups
    int cursor = 0;
    StringBuilder result = new StringBuilder();

    while (cursor < replacement.length()) {
        char nextChar = replacement.charAt(cursor);
        if (nextChar == '\\') {
            cursor++;
            nextChar = replacement.charAt(cursor);
            result.append(nextChar);
            cursor++;
        } else if (nextChar == '$') {
            // Skip past $
            cursor++;
            // A StringIndexOutOfBoundsException is thrown if
            // this "$" is the last character in replacement
            // string in current implementation, a IAE might be
            // more appropriate.
            nextChar = replacement.charAt(cursor);
            int refNum = -1;
            if (nextChar == '{') {
                cursor++;
                StringBuilder gsb = new StringBuilder();
                while (cursor < replacement.length()) {
                    nextChar = replacement.charAt(cursor);
                    if (ASCII.isLower(nextChar) ||
                        ASCII.isUpper(nextChar) ||
                        ASCII.isDigit(nextChar)) {
                        gsb.append(nextChar);
                        cursor++;
                    } else {
                        break;
                    }
                }
                if (gsb.length() == 0)
                    throw new IllegalArgumentException(
                        "named capturing group has 0 length name");
                if (nextChar != '}')
                    throw new IllegalArgumentException(
                        "named capturing group is missing trailing '}'");
                String gname = gsb.toString();
                if (ASCII.isDigit(gname.charAt(0)))
                    throw new IllegalArgumentException(
                        "capturing group name {" + gname +
                        "} starts with digit character");
                if (!parentPattern.namedGroups().containsKey(gname))
                    throw new IllegalArgumentException(
                        "No group with name {" + gname + "}");
                refNum = parentPattern.namedGroups().get(gname);
                cursor++;
            } else {
                // The first number is always a group
                refNum = (int)nextChar - '0';
                if ((refNum < 0)||(refNum > 9))
                    throw new IllegalArgumentException(
                        "Illegal group reference");
                cursor++;
                // Capture the largest legal group string
                boolean done = false;
                while (!done) {
                    if (cursor >= replacement.length()) {
                        break;
                    }
                    int nextDigit = replacement.charAt(cursor) - '0';
                    if ((nextDigit < 0)||(nextDigit > 9)) { // not a number
                        break;
                    }
                    int newRefNum = (refNum * 10) + nextDigit;
                    if (groupCount() < newRefNum) {
                        done = true;
                    } else {
                        refNum = newRefNum;
                        cursor++;
                    }
                }
            }
            // Append group
            if (start(refNum) != -1 && end(refNum) != -1)
                result.append(text, start(refNum), end(refNum));
        } else {
            result.append(nextChar);
            cursor++;
        }
    }
    // Append the intervening text
    sb.append(text, lastAppendPosition, first);
    // Append the match substitution
    sb.append(result);

    lastAppendPosition = last;
    return this;
}
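The `$` branch in the source above is what gives `$1` and `${name}` their meaning inside a replacement string. The sketch below exercises that branch; the class name `GroupReferences`, its method names, and the sample inputs are hypothetical.

```java
public class GroupReferences {
    // $2 and $1 refer to the second and first capturing groups.
    static String swap(String text) {
        return text.replaceAll("(\\w+)=(\\w+)", "$2=$1");
    }

    // ${word} refers to the named group (?<word>...).
    static String tag(String text) {
        return text.replaceAll("(?<word>\\w+)", "<${word}>");
    }

    // With only one group in the pattern, "$12" is read as group 1
    // followed by the literal digit 2 ("largest legal group" rule).
    static String digits(String text) {
        return text.replaceAll("(x)", "$12");
    }

    public static void main(String[] args) {
        System.out.println(swap("key=value")); // value=key
        System.out.println(tag("a b"));        // <a> <b>
        System.out.println(digits("x"));       // x2
    }
}
```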
In the code above we find this fragment:
char nextChar = replacement.charAt(cursor);
if (nextChar == '\\') {
    cursor++;
    nextChar = replacement.charAt(cursor);
    result.append(nextChar);
    cursor++;
}
In other words, while processing the replacement string, it turns two consecutive backslashes ('\\') into a single backslash ('\').
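The backslash handling in that fragment is easy to observe from the caller's side. In this sketch (the class name `ReplacementEscapes` and its methods are hypothetical), the Java literal "\\\\" is the two-character replacement \\, which comes out as one backslash, and `Matcher.quoteReplacement` is used when the replacement should be taken literally.

```java
import java.util.regex.Matcher;

public class ReplacementEscapes {
    // The replacement is the two characters \\ and emits a single backslash.
    static String slashToBackslash(String text) {
        return text.replaceAll("/", "\\\\");
    }

    // quoteReplacement escapes '\' and '$' so they lose their special meaning.
    static String usdToDollar(String text) {
        return text.replaceAll("USD", Matcher.quoteReplacement("$"));
    }

    public static void main(String[] args) {
        System.out.println(slashToBackslash("a/b")); // a\b
        System.out.println(usdToDollar("5 USD"));    // 5 $
    }
}
```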
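Note, however, that this handling applies only to the replacement argument, not to the regex, so it is not what made my first call fail. The real cause is that the text in my database did not contain actual carriage-return and line-feed characters; it contained the literal two-character sequences backslash + r and backslash + n. In Java source, "\\r" is the regex \r, which matches a real carriage return. To match a literal backslash followed by r, the regex has to be \\r, which is written "\\\\r" in a Java string literal, with four consecutive backslashes.
That is: text.replaceAll("(\\\\r\\\\n|\\\\r|\\\\n|\\\\n\\\\r)", "<br />");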
After testing, this line replaced the line-break sequences in the content perfectly.
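The two patterns match different data, which a small comparison makes visible. This is a sketch under the assumption that the stored text may hold either real line breaks or the literal text backslash + r, backslash + n; the class name `LineBreakToBr` and its methods are hypothetical.

```java
public class LineBreakToBr {
    // Regex \r\n|\r|\n: matches real CR and LF characters.
    static String realBreaks(String text) {
        return text.replaceAll("\\r\\n|\\r|\\n", "<br />");
    }

    // Regex \\r\\n|\\r|\\n: matches a literal backslash followed by r or n.
    static String literalEscapes(String text) {
        return text.replaceAll("\\\\r\\\\n|\\\\r|\\\\n", "<br />");
    }

    // Handles both kinds of input by applying the two passes in turn.
    static String both(String text) {
        return literalEscapes(realBreaks(text));
    }

    public static void main(String[] args) {
        String real = "a\r\nb";       // real CR + LF
        String literal = "a\\r\\nb";  // the six characters a \ r \ n b
        System.out.println(realBreaks(literal).equals(literal)); // true, nothing matched
        System.out.println(literalEscapes(literal));             // a<br />b
        System.out.println(realBreaks(real));                    // a<br />b
        System.out.println(both(real + literal));                // a<br />ba<br />b
    }
}
```

The fourth alternative from the original pattern is left out here: alternatives are tried left to right, so a lone n alternative always matches before the n-then-r alternative can be tried.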
This problem cost me nearly an hour, so I am posting it here for reference.