How to avoid truncating Unicode characters

NSString *str = @"👴🏻👮🏽";
NSRange range = NSMakeRange(2, str.length - 2);
NSString *subStr = [str substringWithRange:range];

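Here str.length is 8, not 4. NSString counts UTF-16 code units, and each of the four code points in this string lies outside the Basic Multilingual Plane, so each one is stored as a surrogate pair: two 16-bit code units, 4 bytes in total.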
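To make the gap between code units and visible characters concrete, here is a minimal sketch. The helper name CountComposedCharacters is hypothetical (it is not part of the original post); it relies on the Foundation method enumerateSubstringsInRange:options:usingBlock: with NSStringEnumerationByComposedCharacterSequences.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the original post): counts user-visible
// characters (composed character sequences) instead of UTF-16 code units.
static NSUInteger CountComposedCharacters(NSString *string) {
    __block NSUInteger count = 0;
    [string enumerateSubstringsInRange:NSMakeRange(0, string.length)
                               options:NSStringEnumerationByComposedCharacterSequences
                            usingBlock:^(NSString *substring, NSRange substringRange,
                                         NSRange enclosingRange, BOOL *stop) {
        count++; // one callback per composed character sequence
    }];
    return count;
}
```

For a string such as @"a😀", length reports 3 code units while this helper reports 2 characters. How emoji with skin-tone modifiers are grouped depends on the Foundation version, so the count for the string in this post may vary by system.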

- (NSString *)utf8ToUnicode:(NSString *)string {
    NSUInteger length = [string length];
    NSMutableString *str = [NSMutableString stringWithCapacity:length];
    for (NSUInteger i = 0; i < length; i++) {
        unichar c = [string characterAtIndex:i];
        // ASCII digits and letters are copied through unchanged
        if ((c >= '0' && c <= '9') ||
            (c >= 'a' && c <= 'z') ||
            (c >= 'A' && c <= 'Z')) {
            [str appendFormat:@"%C", c];
        } else {
            // Chinese and all other characters become \uXXXX.
            // Pad with zeros to four hex digits, otherwise decoding fails.
            [str appendFormat:@"\\u%04x", c];
        }
    }
    return str;
}

NSString *strB = [self utf8ToUnicode:str];

This converts the non-ASCII characters to their escaped form, so strB contains the text \ud83d\udc74\ud83c\udffb\ud83d\udc6e\ud83c\udffd.

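subStr therefore holds the code units \ud83c\udffb\ud83d\udc6e\ud83c\udffd. The cut at location 2 falls between two surrogate pairs, so after truncation it prints as 🏻👮🏽. If range.location starts at 1 instead, the cut lands in the middle of a surrogate pair and the whole string is printed in escaped form, because the orphaned code unit has no partner to combine with into a displayable character. A loop prints the result for every offset: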

for (int i = 0; i < str.length; i++) {
        NSRange range = NSMakeRange(i, str.length - i);
        NSString *temp = [str substringWithRange:range];
        NSLog(@"temp = %@", temp);
}

/*
     temp = 👴🏻👮🏽
     temp = \udc74\ud83c\udffb\ud83d\udc6e\ud83c\udffd
     temp = 🏻👮🏽
     temp = \udffb\ud83d\udc6e\ud83c\udffd
     temp = 👮🏽
     temp = \udc6e\ud83c\udffd
     temp = 🏽
     temp = \udffd
*/
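The odd offsets print escapes because each one separates a high surrogate from the low surrogate that follows it. A minimal sketch of that check is below. The helper name IndexSplitsSurrogatePair is hypothetical; CFStringIsSurrogateHighCharacter and CFStringIsSurrogateLowCharacter are inline functions from CoreFoundation.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the original post): returns YES when cutting
// `string` at `index` would separate a high surrogate from the low surrogate
// that immediately follows it.
static BOOL IndexSplitsSurrogatePair(NSString *string, NSUInteger index) {
    if (index == 0 || index >= string.length) {
        return NO; // the start and end of the string are always safe cuts
    }
    unichar before = [string characterAtIndex:index - 1];
    unichar after = [string characterAtIndex:index];
    return CFStringIsSurrogateHighCharacter(before) &&
           CFStringIsSurrogateLowCharacter(after);
}
```

This only detects split surrogate pairs. It reports offset 2 as safe even though that cut separates an emoji from its skin-tone modifier, which is why the substring at offset 2 is printable but starts with a bare modifier.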

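These results may not be what we want. If the truncated string should contain only whole displayable characters, that is 👴🏻, 👮🏽, or 👴🏻👮🏽, use rangeOfComposedCharacterSequencesForRange: to adjust the range. It expands the range to the nearest composed-character boundaries, which prevents a valid Unicode character from being cut into invalid fragments that cannot be displayed. See below: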

for (int i = 0; i < str.length; i++) {
        NSRange range = NSMakeRange(i, str.length - i);
        range = [str rangeOfComposedCharacterSequencesForRange:range];
        NSString *temp = [str substringWithRange:range];
        NSLog(@"temp = %@", temp);
}

/*
     temp = 👴🏻👮🏽
     temp = 👴🏻👮🏽
     temp = 👴🏻👮🏽
     temp = 👴🏻👮🏽
     temp = 👮🏽
     temp = 👮🏽
     temp = 👮🏽
     temp = 👮🏽
*/
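Because rangeOfComposedCharacterSequencesForRange: expands the range, the result can be longer than the range requested. When the goal is a hard length limit (for example, a maximum field size), a complementary approach is to shrink to the previous boundary instead. The sketch below assumes that requirement; the helper name SafePrefix is hypothetical, and it uses the Foundation method rangeOfComposedCharacterSequenceAtIndex:.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the original post): returns the longest prefix
// of `string` that is at most `maxLength` UTF-16 code units long and ends on a
// composed-character boundary.
static NSString *SafePrefix(NSString *string, NSUInteger maxLength) {
    if (string.length <= maxLength) {
        return string; // nothing to cut
    }
    // The sequence containing index `maxLength` starts at or before that
    // index, so cutting at its start never splits a character.
    NSRange sequence = [string rangeOfComposedCharacterSequenceAtIndex:maxLength];
    return [string substringToIndex:sequence.location];
}
```

For example, SafePrefix(@"a😀b", 2) returns @"a" rather than "a" followed by half of a surrogate pair, since index 2 falls inside the emoji.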