A Brief Analysis of HashMap and ConcurrentHashMap

阿新 • Published: 2019-02-15

HashMap

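A HashMap is essentially an array plus linked lists (in the pre-Java 8 implementation excerpted here). A hash value is computed from the key, and the array index is derived from that hash. If several keys map to the same index, their entries are chained in a linked list, with the newest entry inserted at the head.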

Here are three important code excerpts:

a: the constructor

    public HashMap(int initialCapacity, float loadFactor) {
        int capacity = 1;
        while (capacity < initialCapacity)
            capacity <<= 1;

        this.loadFactor = loadFactor;
        threshold = (int)(capacity * loadFactor);
        table = new Entry[capacity];
        init();
    }

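There are three key parameters:
capacity: the size of the array, always rounded up to a power of two
loadFactor: the load factor, which controls when the table is resized
threshold: equals capacity * loadFactor, the largest number of entries the table holds before resizing; once the element count goes past it, the table is resized (capacity doubles)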
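
To make the arithmetic concrete, here is a minimal sketch that repeats the constructor's rounding loop and the threshold formula outside of HashMap. The names CapacityDemo, roundUpToPowerOfTwo and thresholdFor are made up for this illustration; 0.75f is the documented default load factor of java.util.HashMap.

```java
// Illustrative only: mirrors the constructor logic shown above, not the JDK class itself.
class CapacityDemo {
    // Smallest power of two that is >= initialCapacity (same loop as the constructor).
    static int roundUpToPowerOfTwo(int initialCapacity) {
        int capacity = 1;
        while (capacity < initialCapacity)
            capacity <<= 1;
        return capacity;
    }

    // Entry count at which the table is considered full.
    static int thresholdFor(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);
    }

    public static void main(String[] args) {
        int capacity = roundUpToPowerOfTwo(100);
        System.out.println("capacity  = " + capacity);
        System.out.println("threshold = " + thresholdFor(capacity, 0.75f));
    }
}
```

A requested capacity of 100 is rounded up to 128, which gives a threshold of 96 at a load factor of 0.75.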

    public V get(Object key) {
        if (key == null)
            return getForNullKey();
        int hash = hash(key.hashCode());
        for (Entry<K,V> e = table[indexFor(hash, table.length)];
             e != null;
             e = e.next) {
            Object k;
            if (e.hash == hash && ((k = e.key) == key || key.equals(k)))
                return e.value;
        }
        return null;
    }

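The hash is computed from the key, the array index is derived from the hash, and the linked list at that index is fetched. The list is then walked, using equals to find the entry with the matching key and return its value.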
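
The excerpt calls indexFor without showing its body. As a sketch, assuming indexFor is the bit mask h & (length - 1) (the same expression that getFirst uses in the ConcurrentHashMap code later in this article), the following shows why the table length has to be a power of two: the mask then behaves like a non-negative modulo. IndexForDemo is a made-up name.

```java
class IndexForDemo {
    // Assumed body of indexFor: keep only the low bits of the hash.
    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    public static void main(String[] args) {
        int length = 16; // power of two, so length - 1 is binary 1111
        for (int h : new int[] {17, 33, -1}) {
            System.out.println(h + " -> bucket " + indexFor(h, length)
                    + " (floorMod gives " + Math.floorMod(h, length) + ")");
        }
    }
}
```

Hashes 17 and 33 land in the same bucket, which is exactly the collision case that the linked list handles.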

    public V put(K key, V value) {
        if (key == null)
            return putForNullKey(value);
        int hash = hash(key.hashCode());
        int i = indexFor(hash, table.length);
        for (Entry<K,V> e = table[i]; e != null; e = e.next) {
            Object k;
            if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
                V oldValue = e.value;
                e.value = value;
                e.recordAccess(this);
                return oldValue;
            }
        }

        modCount++;
        addEntry(hash, key, value, i);
        return null;
    }

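The head of the chain is fetched from the array (via the hash value), and each entry's key is compared using equals. If a match is found (the key already exists in the map), the old value is overwritten and returned.

Otherwise a new entry is added and null is returned. The new entry becomes the head of the chain, and the entries previously at that array position hang behind it.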
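
The two return paths of put can be observed through the public java.util.HashMap API. This is only a small usage sketch; PutReturnDemo and describe are made-up names.

```java
import java.util.HashMap;
import java.util.Map;

class PutReturnDemo {
    // Returns "firstReturn,secondReturn,currentValue,size" for two puts on the same key.
    static String describe() {
        Map<String, Integer> map = new HashMap<>();
        Integer first = map.put("k", 1);  // key absent: a new entry is added
        Integer second = map.put("k", 2); // key present: the value is overwritten
        return first + "," + second + "," + map.get("k") + "," + map.size();
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

The first put returns null, the second returns the overwritten value 1, and the map still holds a single entry.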

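Also note modCount: it counts structural modifications so that iterators can fail fast. If the map is modified while a batch of entries is being read in a loop, the iterator throws an exception:

    if (modCount != expectedModCount)
        throw new ConcurrentModificationException();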
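
The check can be triggered from a single thread. A minimal sketch using only the public API (FailFastDemo and modifyWhileIterating are made-up names): adding a new key while iterating over the key set is a structural modification, so the next step of the iterator notices the changed modCount.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;

class FailFastDemo {
    // Returns true if the iterator detected the modification.
    static boolean modifyWhileIterating() {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);
        try {
            for (String key : map.keySet()) {
                map.put("new-" + key, 0); // adds a key, so modCount changes
            }
        } catch (ConcurrentModificationException e) {
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println("detected: " + modifyWhileIterating());
    }
}
```

The HashMap documentation describes this behavior as best effort, so it is a debugging aid rather than something to build program logic on.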

Next, adding an entry to the map:

    void addEntry(int hash, K key, V value, int bucketIndex) {
        Entry<K,V> e = table[bucketIndex];
        table[bucketIndex] = new Entry<K,V>(hash, key, value, e);
        if (size++ >= threshold)
            resize(2 * table.length);
    }

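After the entry is added, if size (as it was before this insert) had already reached threshold, the table is resized to twice its previous length.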
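
Because the check is size++ >= threshold, the comparison uses the size before the increment. A small simulation of just that counter logic (no real table involved; ResizeTriggerDemo and insertThatTriggersResize are made-up names) shows which insert first triggers a resize:

```java
class ResizeTriggerDemo {
    // Returns the 1-based number of the insert whose check first calls resize.
    static int insertThatTriggersResize(int capacity, float loadFactor) {
        int threshold = (int) (capacity * loadFactor);
        int size = 0;
        for (int insert = 1; ; insert++) {
            if (size++ >= threshold) // same condition as in addEntry
                return insert;
        }
    }

    public static void main(String[] args) {
        System.out.println(insertThatTriggersResize(16, 0.75f));
    }
}
```

With capacity 16 and load factor 0.75 the threshold is 12, and it is the 13th insert that triggers the resize.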

    void resize(int newCapacity) {

        Entry[] newTable = new Entry[newCapacity];
        transfer(newTable);
        table = newTable;
        threshold = (int)(newCapacity * loadFactor);
    }

A new array is created and the existing data is transferred into it.

    void transfer(Entry[] newTable) {
        Entry[] src = table;
        int newCapacity = newTable.length;
        for (int j = 0; j < src.length; j++) {
            Entry<K,V> e = src[j];
            if (e != null) {
                src[j] = null;
                do {
                    Entry<K,V> next = e.next;
                    int i = indexFor(e.hash, newCapacity);
                    e.next = newTable[i];
                    newTable[i] = e;
                    e = next;
                } while (e != null);
            }
        }
    }

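Each chain is taken out of the old array in turn. Every element of the chain is visited, its index is recomputed, and it is placed into the new array. Each moved element is again inserted at the head of its chain, so elements that stay together end up in reverse order.

After a chain has been taken out, its slot in the old array is set to null (perhaps so that it can be garbage collected sooner when a large amount of data is copied?).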
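
The head insertion during transfer can be traced on a tiny example. This is a sketch under simplifying assumptions: Node is a stripped-down stand-in for Entry that carries only a hash, and the index is computed as hash & (newCapacity - 1). TransferDemo and hashesIn are made-up names.

```java
import java.util.ArrayList;
import java.util.List;

class TransferDemo {
    // Simplified stand-in for HashMap.Entry: only the hash and the next pointer.
    static final class Node {
        final int hash;
        Node next;
        Node(int hash, Node next) { this.hash = hash; this.next = next; }
    }

    // Same loop structure as transfer above, returning the new table.
    static Node[] transfer(Node[] src, int newCapacity) {
        Node[] newTable = new Node[newCapacity];
        for (int j = 0; j < src.length; j++) {
            Node e = src[j];
            src[j] = null; // drop the old slot's reference
            while (e != null) {
                Node next = e.next;
                int i = e.hash & (newCapacity - 1);
                e.next = newTable[i]; // insert at the head of the new chain
                newTable[i] = e;
                e = next;
            }
        }
        return newTable;
    }

    static List<Integer> hashesIn(Node head) {
        List<Integer> out = new ArrayList<>();
        for (Node n = head; n != null; n = n.next)
            out.add(n.hash);
        return out;
    }

    public static void main(String[] args) {
        Node[] old = new Node[2];
        old[1] = new Node(1, new Node(3, new Node(5, null))); // chain 1 -> 3 -> 5
        Node[] grown = transfer(old, 4);
        System.out.println("bucket 1: " + hashesIn(grown[1]));
        System.out.println("bucket 3: " + hashesIn(grown[3]));
    }
}
```

In the old table of length 2 the chain in bucket 1 is 1 -> 3 -> 5. After the transfer to length 4, bucket 1 holds 5 -> 1 (reversed) and bucket 3 holds 3.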

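Two more points to note:

static class Entry<K,V> implements Map.Entry<K,V> is a static nested class of HashMap, whereas the iterators and similar helpers are inner (non-static) classes. Entry is static because not every element needs to hold a reference to the enclosing map's this; the iterators do need it.

HashMap marks fields such as transient Entry[] table; as transient and defines its own private readObject and writeObject methods, so it implements serialization itself.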
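
The second point follows the standard custom-serialization pattern of java.io.Serializable. The sketch below is not HashMap's code; CompactList is a made-up toy class (fixed capacity, no bounds checks) that keeps its backing array transient, writes only the live elements, and rebuilds the array when read back.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class CompactList implements Serializable {
    private static final long serialVersionUID = 1L;
    private static final int CAPACITY = 16; // toy limit, no growth

    private transient String[] slots = new String[CAPACITY]; // not serialized as-is
    private transient int size;

    void add(String s) { slots[size++] = s; }
    String get(int i) { return slots[i]; }
    int size() { return size; }

    // Write the element count followed by the live elements only.
    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject();
        out.writeInt(size);
        for (int i = 0; i < size; i++)
            out.writeObject(slots[i]);
    }

    // Field initializers do not run on deserialization, so rebuild the array here.
    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        int n = in.readInt();
        slots = new String[CAPACITY];
        size = 0;
        for (int i = 0; i < n; i++)
            add((String) in.readObject());
    }

    // Serializes to memory and reads the copy back.
    static CompactList roundTrip(CompactList original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (CompactList) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        CompactList list = new CompactList();
        list.add("x");
        list.add("y");
        CompactList copy = roundTrip(list);
        System.out.println(copy.size() + " " + copy.get(0) + " " + copy.get(1));
    }
}
```

For a hash table this approach has a further benefit: the receiving side re-inserts the entries, so the bucket layout does not depend on hash codes computed in the JVM that wrote the data.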

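ConcurrentHashMap

Building on HashMap, ConcurrentHashMap (again in the pre-Java 8 implementation) splits the data into several segments, 16 by default (the concurrency level). An operation that needs a lock takes it on a single segment only, which lowers the chance of threads contending for the same lock and improves concurrent throughput.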

    public V get(Object key) {
        int hash = hash(key.hashCode());
        return segmentFor(hash).get(key, hash);
    }

    final Segment<K,V> segmentFor(int hash) {
        return segments[(hash >>> segmentShift) & segmentMask];
    }
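
The excerpt does not show where segmentShift and segmentMask come from. As a sketch, assuming they are derived from the concurrency level the way I recall the JDK 6 constructor doing it (round the level up to a power of two, use the mask ssize - 1 and the shift 32 - sshift), the segment is chosen by the high bits of the hash while the bucket inside the segment is chosen by the low bits. SegmentIndexDemo and its method names are made up.

```java
class SegmentIndexDemo {
    final int segmentShift;
    final int segmentMask;

    // Assumed derivation: ssize is the concurrency level rounded up to a power of two.
    SegmentIndexDemo(int concurrencyLevel) {
        int sshift = 0;
        int ssize = 1;
        while (ssize < concurrencyLevel) {
            ++sshift;
            ssize <<= 1;
        }
        segmentShift = 32 - sshift;
        segmentMask = ssize - 1;
    }

    // Same expression as segmentFor: the top bits of the hash pick the segment.
    int segmentIndex(int hash) {
        return (hash >>> segmentShift) & segmentMask;
    }

    // Same expression as getFirst: the low bits pick the bucket inside the segment.
    static int bucketIndex(int hash, int tableLength) {
        return hash & (tableLength - 1);
    }

    public static void main(String[] args) {
        SegmentIndexDemo demo = new SegmentIndexDemo(16);
        int hash = 0xA0000001;
        System.out.println("segment " + demo.segmentIndex(hash)
                + ", bucket " + bucketIndex(hash, 16));
    }
}
```

With 16 segments the shift is 28 and the mask is 15, so the hash 0xA0000001 goes to segment 10 and, in a segment table of length 16, to bucket 1. Using different bits for the two decisions keeps keys that share a segment from also piling into the same bucket.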

In class Segment:

        V get(Object key, int hash) {
            if (count != 0) { // read-volatile
                HashEntry<K,V> e = getFirst(hash);
                while (e != null) {
                    if (e.hash == hash && key.equals(e.key)) {
                        V v = e.value;
                        if (v != null)
                            return v;
                        return readValueUnderLock(e); // recheck
                    }
                    e = e.next;
                }
            }
            return null;
        }

        /**
         * Reads value field of an entry under lock. Called if value
         * field ever appears to be null. This is possible only if a
         * compiler happens to reorder a HashEntry initialization with
         * its table assignment, which is legal under memory model
         * but is not known to ever occur.
         */   
        V readValueUnderLock(HashEntry<K,V> e) {
            lock();
            try {
                return e.value;
            } finally {
                unlock();
            }
        }

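Note that concurrent reads take no lock, except when the value read for the key turns out to be null. How can that be safe? Three things make it work:

1. The table reference is volatile.

        HashEntry<K,V> getFirst(int hash) {
            HashEntry<K,V>[] tab = table;
            return tab[hash & (tab.length - 1)];
        }

If the array size (tab.length) changed in the middle of a read, the result would be wrong. But the field is declared transient volatile HashEntry<K,V>[] table; so a replaced table is visible to readers immediately. In addition, getFirst copies the reference into the local tab first, so the length and the element access always refer to the same array.

2. next is final.

    static final class HashEntry<K,V> {
        final K key;
        final int hash;
        volatile V value;
        final HashEntry<K,V> next;
        // constructor omitted
    }

Because next is final, once a HashEntry has been read, the chain reachable from it can never change, so the reader walks a consistent list.

3. value is volatile, so when a put overwrites the value of an existing key, readers see the new value immediately.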
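
The three points can be combined into a very small sketch of one bucket: a volatile head, nodes with a final next and a volatile value, writers serialized by a lock, and readers that take no lock. This is an illustration of the idea and not the Segment code; it uses synchronized where Segment uses its own lock()/unlock(), and LockFreeReadChain is a made-up name.

```java
class LockFreeReadChain<K, V> {
    static final class Node<K, V> {
        final K key;
        volatile V value;      // overwrites become visible to readers
        final Node<K, V> next; // the chain behind a node never changes

        Node(K key, V value, Node<K, V> next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    private volatile Node<K, V> head; // readers always see the latest published head

    // No lock: walks whatever chain was published when head was read.
    V get(Object key) {
        for (Node<K, V> e = head; e != null; e = e.next) {
            if (key.equals(e.key))
                return e.value;
        }
        return null;
    }

    // Writers are serialized; a new key is published as a new head node.
    synchronized V put(K key, V value) {
        for (Node<K, V> e = head; e != null; e = e.next) {
            if (key.equals(e.key)) {
                V old = e.value;
                e.value = value;
                return old;
            }
        }
        head = new Node<>(key, value, head);
        return null;
    }

    public static void main(String[] args) {
        LockFreeReadChain<String, Integer> chain = new LockFreeReadChain<>();
        chain.put("a", 1);
        chain.put("a", 2);
        System.out.println(chain.get("a"));
    }
}
```

A reader that started before a new head was published simply does not see the new node; it behaves as if the read happened just before the write.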

    public V put(K key, V value) {
        if (value == null)
            throw new NullPointerException();
        int hash = hash(key.hashCode());
        return segmentFor(hash).put(key, hash, value, false);
    }

        V put(K key, int hash, V value, boolean onlyIfAbsent) {
            lock();
            try {
                int c = count;
                if (c++ > threshold) // ensure capacity
                    rehash();
                HashEntry<K,V>[] tab = table;
                int index = hash & (tab.length - 1);
                HashEntry<K,V> first = tab[index];
                HashEntry<K,V> e = first;
                while (e != null && (e.hash != hash || !key.equals(e.key)))
                    e = e.next;

                V oldValue;
                if (e != null) {
                    oldValue = e.value;
                    if (!onlyIfAbsent)
                        e.value = value;
                }
                else {
                    oldValue = null;
                    ++modCount;
                    tab[index] = new HashEntry<K,V>(key, hash, first, value);
                    count = c; // write-volatile
                }
                return oldValue;
            } finally {
                unlock();
            }
        }

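Apart from taking the lock, this works on much the same principle as a plain HashMap.

There is one thing I still do not understand:

When a get runs concurrently with a put or remove: suppose the get has already fetched the chain with HashEntry<K,V> e = getFirst(hash); and a put then inserts a new entry at the head of the chain. If that entry happens to be the key being looked up, will the get still fail to find it?

On remove, the entry for the removed key is dropped first, and then the elements that were in front of it are re-attached one by one at the head of the list (as fresh copies, because next is final). Here too, after the remove the former head ends up in the middle of the list, so a concurrent read might likewise miss elements.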

                        ++modCount;
                        HashEntry<K,V> newFirst = e.next;
                        for (HashEntry<K,V> p = first; p != e; p = p.next)
                            newFirst = new HashEntry<K,V>(p.key, p.hash,
                                                          newFirst, p.value);
                        tab[index] = newFirst;
                        count = c; // write-volatile
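
To make the remove case concrete, here is a minimal sketch of the copying loop above on a simplified entry type (key and final next only; CopyOnRemoveDemo and keys are made-up names). It shows one property that follows directly from next being final: the old chain is never modified, so a reader that already holds the old head still walks the complete old list, including the removed entry.

```java
import java.util.ArrayList;
import java.util.List;

class CopyOnRemoveDemo {
    // Simplified HashEntry: immutable link to the next entry.
    static final class HashEntry {
        final String key;
        final HashEntry next;
        HashEntry(String key, HashEntry next) { this.key = key; this.next = next; }
    }

    // Returns the new head; entries in front of the removed one are cloned.
    static HashEntry remove(HashEntry first, String key) {
        HashEntry e = first;
        while (e != null && !e.key.equals(key))
            e = e.next;
        if (e == null)
            return first; // key not present, nothing changes
        HashEntry newFirst = e.next; // the tail behind e is reused as-is
        for (HashEntry p = first; p != e; p = p.next)
            newFirst = new HashEntry(p.key, newFirst);
        return newFirst;
    }

    static List<String> keys(HashEntry head) {
        List<String> out = new ArrayList<>();
        for (HashEntry e = head; e != null; e = e.next)
            out.add(e.key);
        return out;
    }

    public static void main(String[] args) {
        HashEntry oldHead = new HashEntry("A",
                new HashEntry("B", new HashEntry("C", new HashEntry("D", null))));
        HashEntry newHead = remove(oldHead, "C");
        System.out.println("new chain: " + keys(newHead));
        System.out.println("old chain: " + keys(oldHead));
    }
}
```

Removing C from A -> B -> C -> D publishes the new chain B -> A -> D, while the old chain A -> B -> C -> D stays intact for any reader still traversing it.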
