
String Source Code Analysis


I. Class Definition

public final class String implements java.io.Serializable, Comparable<String>, CharSequence {...}
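  1. The class is final, so it cannot be subclassed; String objects are also immutable once initialized.
  2. It implements Serializable, so it can be serialized and deserialized.
  3. It implements Comparable, so it must provide compareTo(String s) for comparison.
  4. It implements CharSequence, which declares length():int, charAt(int):char, subSequence(int,int):CharSequence, toString():String, chars():IntStream, and codePoints():IntStream.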
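
As a rough illustration of what these interfaces give callers, the sketch below passes a String wherever a CharSequence is expected and compares strings through Comparable. The class name StringInterfacesDemo and the helper countUpperCase are hypothetical names chosen for this example.

```java
final class StringInterfacesDemo {
    // Accepts any CharSequence, so a String (or StringBuilder) can be passed directly.
    static int countUpperCase(CharSequence text) {
        int count = 0;
        for (int i = 0; i < text.length(); i++) {
            if (Character.isUpperCase(text.charAt(i))) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countUpperCase("Hello World"));    // String used as a CharSequence
        System.out.println("apple".compareTo("banana") < 0);  // Comparable<String> ordering
        System.out.println("hello".subSequence(1, 3));        // characters at index 1 and 2
        System.out.println("hello".chars().count());          // chars() returns an IntStream
    }
}
```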

II. Member Variables

private final char value[];
private int hash;
private static final long serialVersionUID = -6849794470754667710L;
private static final ObjectStreamField[] serialPersistentFields = new ObjectStreamField[0];
public static final Comparator<String> CASE_INSENSITIVE_ORDER = new CaseInsensitiveComparator();

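value is the underlying storage of the String, a char array.

hash caches the string's hash code; it defaults to 0 and is computed on the first call to hashCode().

serialVersionUID is the version identifier used during serialization and deserialization.

serialPersistentFields is an ObjectStreamField array used to declare a class's serializable fields. It is not used within this class.

CASE_INSENSITIVE_ORDER is a comparator for case-insensitive ordering, created from a nested class (CaseInsensitiveComparator).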
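
A minimal sketch of how CASE_INSENSITIVE_ORDER might be used for sorting. The class name CaseInsensitiveOrderDemo and the helper sortIgnoringCase are hypothetical; only String.CASE_INSENSITIVE_ORDER and the java.util.Arrays calls come from the JDK.

```java
import java.util.Arrays;

final class CaseInsensitiveOrderDemo {
    // Returns a sorted copy of the input, ignoring case differences.
    static String[] sortIgnoringCase(String[] words) {
        String[] copy = Arrays.copyOf(words, words.length);
        Arrays.sort(copy, String.CASE_INSENSITIVE_ORDER);
        return copy;
    }

    public static void main(String[] args) {
        String[] words = {"banana", "apple", "Cherry"};

        // prints [apple, banana, Cherry]
        System.out.println(Arrays.toString(sortIgnoringCase(words)));

        // Natural ordering compares char values, so the upper-case 'C' sorts first.
        String[] natural = Arrays.copyOf(words, words.length);
        Arrays.sort(natural);
        // prints [Cherry, apple, banana]
        System.out.println(Arrays.toString(natural));
    }
}
```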

III. Methods

2.1 Constructors

(1) String as the argument

public String() { this.value = "".value; }
public String(String original){
  this.value=original.value; 
  this.hash=original.hash;
}

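This initializes a String from another String object: the source String's value and hash fields are assigned directly to the new String. No array copy is needed, because the shared char array is never modified.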
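
The following sketch shows the observable effect of this constructor: the new object is a distinct reference with equal content and the same hash code. CopyConstructorDemo and distinctButEqual are hypothetical names used only for illustration.

```java
final class CopyConstructorDemo {
    // True when the two references are different objects holding equal content.
    static boolean distinctButEqual(String a, String b) {
        return a != b && a.equals(b);
    }

    public static void main(String[] args) {
        String original = "hello";
        String copy = new String(original);

        System.out.println(copy == original);                        // false: a new object was created
        System.out.println(copy.equals(original));                   // true: same characters
        System.out.println(copy.hashCode() == original.hashCode());  // true: same content, same hash
    }
}
```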

(2) char array as the argument

public String(char value[]){
  this.value = Arrays.copyOf(value, value.length);
}

public String(char value[], int offset, int count) {
  if (offset < 0) {
    throw new StringIndexOutOfBoundsException(offset);
  }
  if (count <= 0) {
    if (count < 0) { throw new StringIndexOutOfBoundsException(count); }
    if (offset <= value.length) { this.value = "".value; return; }
  }
  // Note: offset or count might be near -1>>>1.
  if (offset > value.length - count) {
    throw new StringIndexOutOfBoundsException(offset + count);
  }
  this.value = Arrays.copyOfRange(value, offset, offset + count);
}

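When a String is created from a char array, Arrays.copyOf or Arrays.copyOfRange is used. Both methods copy the contents of the source array, element by element, into a new char array owned by the String, so later changes to the source array do not affect the String.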
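
To make the effect of that copy concrete, here is a small sketch that mutates the source array after constructing the String. DefensiveCopyDemo and buildThenMutate are hypothetical names introduced for this example.

```java
final class DefensiveCopyDemo {
    // Builds a String from the array, then overwrites the array's first element.
    static String buildThenMutate(char[] chars) {
        String s = new String(chars);
        chars[0] = 'X';  // changes the caller's array only, not the String's copy
        return s;
    }

    public static void main(String[] args) {
        char[] chars = {'j', 'a', 'v', 'a'};
        String s = buildThenMutate(chars);

        System.out.println(s);                        // prints java
        System.out.println(chars[0]);                 // prints X
        System.out.println(new String(chars, 1, 2));  // offset 1, count 2: prints av
    }
}
```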

(3) int array as the argument

public String(int[] codePoints, int offset, int count) {
        if (offset < 0) {
            throw new StringIndexOutOfBoundsException(offset);
        }
        if (count <= 0) {
            if (count < 0) {
                throw new StringIndexOutOfBoundsException(count);
            }
            if (offset <= codePoints.length) {
                this.value = "".value;
                return;
            }
        }
        // Note: offset or count might be near -1>>>1.
        if (offset > codePoints.length - count) {
            throw new StringIndexOutOfBoundsException(offset + count);
        }

        final int end = offset + count;

        // Pass 1: Compute precise size of char[]
        int n = count;
        for (int i = offset; i < end; i++) {
            int c = codePoints[i];
            if (Character.isBmpCodePoint(c))
                continue;
            else if (Character.isValidCodePoint(c))
                n++;
            else throw new IllegalArgumentException(Integer.toString(c));
        }

        // Pass 2: Allocate and fill in char[]
        final char[] v = new char[n];

        for (int i = offset, j = 0; i < end; i++, j++) {
            int c = codePoints[i];
            if (Character.isBmpCodePoint(c))
                v[j] = (char)c;
            else
                Character.toSurrogates(c, v, j++);
        }

        this.value = v;
    }
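
The two passes above matter for code points outside the Basic Multilingual Plane, which need two chars (a surrogate pair). The sketch below, using the hypothetical names CodePointConstructorDemo and fromCodePoints, shows the resulting length and the exception for an invalid code point.

```java
final class CodePointConstructorDemo {
    // Builds a String from the given Unicode code points.
    static String fromCodePoints(int... codePoints) {
        return new String(codePoints, 0, codePoints.length);
    }

    public static void main(String[] args) {
        // U+0041 is a BMP code point (one char); U+1F600 is supplementary (two chars).
        String s = fromCodePoints(0x41, 0x1F600);

        System.out.println(s.length());                             // prints 3
        System.out.println(s.codePointCount(0, s.length()));        // prints 2
        System.out.println(Character.isHighSurrogate(s.charAt(1))); // prints true

        try {
            fromCodePoints(0x110000);  // beyond the valid Unicode range
        } catch (IllegalArgumentException e) {
            System.out.println("invalid code point: " + e.getMessage());
        }
    }
}
```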

(4) byte array as the argument

public String(byte bytes[], int offset, int length, String charsetName)
            throws UnsupportedEncodingException {
        if (charsetName == null)
            throw new NullPointerException("charsetName");
        checkBounds(bytes, offset, length);
        this.value = StringCoding.decode(charsetName, bytes, offset, length);
    }

public String(byte bytes[], int offset, int length, Charset charset) {
        if (charset == null)
            throw new NullPointerException("charset");
        checkBounds(bytes, offset, length);
        this.value =  StringCoding.decode(charset, bytes, offset, length);
    }

public String(byte bytes[], String charsetName)
            throws UnsupportedEncodingException {
        this(bytes, 0, bytes.length, charsetName);
    }

public String(byte bytes[], Charset charset) {
        this(bytes, 0, bytes.length, charset);
    }

public String(byte bytes[], int offset, int length) {
        checkBounds(bytes, offset, length);
        this.value = StringCoding.decode(bytes, offset, length);
    }

public String(byte bytes[]) {
        this(bytes, 0, bytes.length);
    }

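Bytes are the serialized form used for network transmission and storage, so converting between byte[] and String always involves character encoding. String(byte[] bytes, Charset charset) decodes the given byte array using charset into a Unicode char[] array and constructs a new String from it. All of these constructors call a decode function; the variant that takes a charset name is as follows: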

static char[] decode(String charsetName, byte[] ba, int off, int len)
        throws UnsupportedEncodingException
    {
        StringDecoder sd = deref(decoder);
        String csn = (charsetName == null) ? "ISO-8859-1" : charsetName;
        if ((sd == null) || !(csn.equals(sd.requestedCharsetName())
                              || csn.equals(sd.charsetName()))) {
            sd = null;
            try {
                Charset cs = lookupCharset(csn);
                if (cs != null)
                    sd = new StringDecoder(cs, csn);
            } catch (IllegalCharsetNameException x) {}
            if (sd == null)
                throw new UnsupportedEncodingException(csn);
            set(decoder, sd);
        }
        return sd.decode(ba, off, len);
    }

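In the code above, a null charsetName falls back to ISO-8859-1. The constructors that take no charset, however, decode with the platform default charset (Charset.defaultCharset()) and fall back to ISO-8859-1 only if that charset is unsupported.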
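
Because the same bytes decode differently under different charsets, it is usually safer to name the charset explicitly. A minimal sketch, with ByteDecodingDemo, decodeUtf8, and decodeLatin1 as hypothetical names:

```java
import java.nio.charset.StandardCharsets;

final class ByteDecodingDemo {
    static String decodeUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    static String decodeLatin1(byte[] bytes) {
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // 0xC3 0xA9 is the UTF-8 encoding of U+00E9 (e with acute accent).
        byte[] bytes = {(byte) 0xC3, (byte) 0xA9};

        // Decoded as UTF-8, the two bytes form a single character.
        System.out.println(decodeUtf8(bytes).length());    // prints 1

        // Decoded as ISO-8859-1, each byte becomes its own character (U+00C3, U+00A9).
        System.out.println(decodeLatin1(bytes).length());  // prints 2
    }
}
```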

(5) StringBuffer and StringBuilder as the argument

public String(StringBuffer buffer) {
        synchronized(buffer) {
            this.value = Arrays.copyOf(buffer.getValue(), buffer.length());
        }
    }

public String(StringBuilder builder) {
        this.value = Arrays.copyOf(builder.getValue(), builder.length());
    }

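On efficiency: the official Java documentation mentions that obtaining a string through StringBuilder's toString method is likely to run faster. StringBuffer's toString method, by contrast, is synchronized, which guarantees thread safety at the cost of speed.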
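
Either way, the resulting String is a snapshot: both constructors copy the builder's contents, so later appends do not change it. A small sketch, with BuilderConversionDemo and snapshotThenAppend as hypothetical names:

```java
final class BuilderConversionDemo {
    // Takes a String snapshot of the builder, then appends to the builder.
    static String snapshotThenAppend(StringBuilder sb, String suffix) {
        String snapshot = sb.toString();  // the form the documentation prefers
        sb.append(suffix);                // does not affect the snapshot
        return snapshot;
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder("abc");
        String viaConstructor = new String(sb);  // copies the current contents
        String viaToString = snapshotThenAppend(sb, "def");

        System.out.println(viaConstructor);  // prints abc
        System.out.println(viaToString);     // prints abc
        System.out.println(sb);              // prints abcdef

        // StringBuffer offers the same constructor, with synchronization.
        System.out.println(new String(new StringBuffer("xyz")));  // prints xyz
    }
}
```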

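2.2 Common methods

length() returns the length of the string

isEmpty() returns whether the string is empty (length 0)

charAt(int index) returns the character at zero-based position index, i.e. the (index+1)-th character

char[] toCharArray() converts the string to a char array

trim() removes leading and trailing whitespace

toUpperCase() converts to upper case

toLowerCase() converts to lower case

String concat(String str) // appends str to the end of this string

String replace(char oldChar, char newChar) // replaces every oldChar in the string with newChar

// Both of the methods above use String(char[] value, boolean share);

boolean matches(String regex) // tests whether the string matches the regular expression regex

boolean contains(CharSequence s) // tests whether the string contains the char sequence s

String[] split(String regex, int limit) splits the string around matches of the regular expression regex, into at most limit parts when limit is positive.

String[] split(String regex) is equivalent to split(regex, 0).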
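
A few of the methods listed above have behavior that is easy to misread, in particular the limit argument of split and the return values of concat and replace. The sketch below uses the hypothetical names CommonMethodsDemo and splitCsv.

```java
import java.util.Arrays;

final class CommonMethodsDemo {
    // Splits a comma-separated line using the given limit.
    static String[] splitCsv(String line, int limit) {
        return line.split(",", limit);
    }

    public static void main(String[] args) {
        // A positive limit caps the number of parts; the last part keeps the remainder.
        System.out.println(Arrays.toString(splitCsv("a,b,c,d", 2)));  // prints [a, b,c,d]
        // Limit 0 (the same as split(regex)) drops trailing empty strings.
        System.out.println(Arrays.toString(splitCsv("a,b,,", 0)));    // prints [a, b]
        // A negative limit keeps trailing empty strings.
        System.out.println(splitCsv("a,b,,", -1).length);             // prints 4

        System.out.println("[" + "  java  ".trim() + "]");  // prints [java]
        System.out.println("java".charAt(0));               // prints j
        System.out.println("java".concat("c"));             // prints javac
        System.out.println("java".replace('a', 'o'));       // prints jovo

        // Per the Javadoc, both methods return this same object when nothing changes.
        String base = "java";
        System.out.println(base.concat("") == base);         // prints true
        System.out.println(base.replace('x', 'y') == base);  // prints true
    }
}
```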

getBytes

public byte[] getBytes(String charsetName)throws UnsupportedEncodingException {
        if (charsetName == null) throw new NullPointerException();
        return StringCoding.encode(charsetName, value, 0, value.length);
    }

public byte[] getBytes(Charset charset) {
        if (charset == null) throw new NullPointerException();
        return StringCoding.encode(charset, value, 0, value.length);
    }
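
getBytes is the inverse of the byte-array constructors, and the byte count depends on the charset chosen. A short sketch, assuming the hypothetical names GetBytesDemo, utf8Length, and roundTrips:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class GetBytesDemo {
    // Number of bytes the string occupies when encoded as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // True when encoding and then decoding with the same charset restores the string.
    static boolean roundTrips(String s, Charset charset) {
        return new String(s.getBytes(charset), charset).equals(s);
    }

    public static void main(String[] args) {
        String s = "\u4e2d\u6587";  // two CJK characters

        System.out.println(s.length());     // prints 2 (chars)
        System.out.println(utf8Length(s));  // prints 6 (three bytes per character)

        System.out.println(roundTrips(s, StandardCharsets.UTF_8));       // prints true
        // ISO-8859-1 cannot represent these characters; they are replaced on encoding.
        System.out.println(roundTrips(s, StandardCharsets.ISO_8859_1));  // prints false
    }
}
```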

Comparison methods

boolean equals(Object anObject);
boolean contentEquals(StringBuffer sb);
boolean contentEquals(CharSequence cs);
boolean equalsIgnoreCase(String anotherString);
int compareTo(String anotherString);
int compareToIgnoreCase(String str);
boolean regionMatches(int toffset, String other, int ooffset, int len)  // region (partial) match
boolean regionMatches(boolean ignoreCase, int toffset, String other, int ooffset, int len)   // region (partial) match, optionally ignoring case

The most distinctive of these is equals:

public boolean equals(Object anObject) {  
        if (this == anObject) {  // both references point to the same object
            return true;
        }
        if (anObject instanceof String) {  // otherwise compare lengths, then contents char by char
            String anotherString = (String)anObject;
            int n = value.length;
            if (n == anotherString.value.length) {
                char v1[] = value;
                char v2[] = anotherString.value;
                int i = 0;
                while (n-- != 0) {
                    if (v1[i] != v2[i])
                        return false;
                    i++;
                }
                return true;
            }
        }
        return false;
    }
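
The practical consequence of this implementation is that equals compares content but only against other String objects. The sketch below contrasts it with == and contentEquals; EqualsDemo and sameContent are hypothetical names.

```java
final class EqualsDemo {
    // Delegates to String.equals, which returns true only for a String with the same chars.
    static boolean sameContent(String a, Object b) {
        return a.equals(b);
    }

    public static void main(String[] args) {
        String literal = "hello";
        String copy = new String("hello");

        System.out.println(literal == copy);             // false: different objects
        System.out.println(sameContent(literal, copy));  // true: same length and chars

        // A StringBuilder is not a String, so the instanceof check fails.
        System.out.println(sameContent(literal, new StringBuilder("hello")));    // false
        // contentEquals compares against any CharSequence instead.
        System.out.println(literal.contentEquals(new StringBuilder("hello")));   // true

        System.out.println(literal.equalsIgnoreCase("HELLO"));  // true
    }
}
```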

For usage of the region-matching methods (regionMatches), see the reference.
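
As a small illustration of the regionMatches methods listed earlier, the sketch below checks whether a word occurs at a given position. RegionMatchDemo and hasWordAt are hypothetical names for this example.

```java
final class RegionMatchDemo {
    // True when text contains word starting exactly at index at.
    static boolean hasWordAt(String text, int at, String word, boolean ignoreCase) {
        return text.regionMatches(ignoreCase, at, word, 0, word.length());
    }

    public static void main(String[] args) {
        String text = "Hello World";

        System.out.println(hasWordAt(text, 6, "World", false));  // true
        System.out.println(hasWordAt(text, 6, "WORLD", false));  // false: case differs
        System.out.println(hasWordAt(text, 6, "WORLD", true));   // true: case ignored

        // An out-of-range region yields false rather than an exception.
        System.out.println(hasWordAt(text, 9, "World", false));  // false
    }
}
```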

Checking a string's prefix and suffix

public boolean startsWith(String prefix, int toffset) {  // prefix: the prefix to test; toffset: index at which the comparison starts
        char ta[] = value;
        int to = toffset;
        char pa[] = prefix.value;
        int po = 0;
        int pc = prefix.value.length;
        // Note: toffset might be near -1>>>1.
        if ((toffset < 0) || (toffset > value.length - pc)) {
            return false;
        }
        while (--pc >= 0) {
            if (ta[to++] != pa[po++]) {
                return false;
            }
        }
        return true;
    }
Similarly:
public boolean startsWith(String prefix) { return startsWith(prefix, 0); }
public boolean endsWith(String suffix) {return startsWith(suffix, value.length - suffix.value.length);}
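
The offset form of startsWith and the way endsWith builds on it can be seen in a short sketch. PrefixSuffixDemo and hasExtension are hypothetical names introduced for illustration.

```java
final class PrefixSuffixDemo {
    // True when fileName ends with a dot followed by ext.
    static boolean hasExtension(String fileName, String ext) {
        return fileName.endsWith("." + ext);
    }

    public static void main(String[] args) {
        String s = "foobar";

        System.out.println(s.startsWith("foo"));      // true
        System.out.println(s.startsWith("bar", 3));   // true: comparison starts at index 3
        System.out.println(s.startsWith("bar", 4));   // false: not enough characters left
        System.out.println(s.startsWith("foo", -1));  // false: negative offset, no exception
        System.out.println(s.endsWith("bar"));        // true: same as startsWith("bar", 6 - 3)

        System.out.println(hasExtension("String.java", "java"));  // true
    }
}
```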

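IV. Summary

String objects are immutable. Assigning a new value to a string reference only changes the memory address the reference points to; the value at the original address is unchanged.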
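
This can be checked with a second reference to the original object, as in the sketch below. ImmutabilityDemo and aliasAfterReassign are hypothetical names used only for this example.

```java
final class ImmutabilityDemo {
    // Returns the value still visible through an alias after the first reference is reassigned.
    static String aliasAfterReassign(String initial, String suffix) {
        String s = initial;
        String alias = s;  // second reference to the same object
        s = s + suffix;    // s now points at a new String; the old object is untouched
        return alias;
    }

    public static void main(String[] args) {
        System.out.println(aliasAfterReassign("abc", "d"));  // prints abc

        // "Modifying" methods return a new String and leave the receiver unchanged.
        String lower = "abc";
        String upper = lower.toUpperCase();
        System.out.println(lower);  // prints abc
        System.out.println(upper);  // prints ABC
    }
}
```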
