A Detailed Walkthrough of the HashMap Source Code in JDK 1.7

阿新 • Published: 2019-01-10

If anything here is wrong, please point it out. Thanks!

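1. HashMap Overview

HashMap is a hash-table-based implementation of the Map interface. It provides all of the optional map operations and permits null values and the null key. HashMap is roughly equivalent to Hashtable, except that it is not thread-safe. The class makes no guarantees about the order of the map; in particular, it does not guarantee that the order will remain constant over time.

Iterating over a HashMap takes time proportional to its capacity (the number of buckets) plus its size (the number of key-value mappings). If iteration performance matters, do not set the initial capacity too high or the load factor too low.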

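2. The Structure of HashMap

HashMap is built on the hash table data structure. A hash table stores its elements in an array and, here, resolves collisions by separate chaining (one of several possible collision-resolution strategies).

As the figure above shows, each slot of the array holds a singly linked list.

The source code is as follows: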

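// The array whose slots hold the linked lists (the buckets)
transient Entry[] table;  
// A key-value pair; it holds a reference to the next Entry, which forms a singly linked list
static class Entry<K,V> implements Map.Entry<K,V> {  
    final K key;  
    V value;  
    // Reference to the next node
    Entry<K,V> next;  
    // Not final in this version: transfer() reassigns it when a rehash is required
    int hash;  
    ……  
}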
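To make the array-of-linked-lists layout concrete, here is a minimal sketch of a chained hash table. It is not the JDK class: TinyChainedMap and Node are hypothetical names, the table has a fixed length of 16 and never resizes, and the raw hashCode is used without any bit mixing.

```java
import java.util.Objects;

// Illustrative sketch only; TinyChainedMap and Node are hypothetical names, not JDK classes.
final class TinyChainedMap<K, V> {

    // One node of a bucket's singly linked list.
    private static final class Node<K, V> {
        final K key;
        V value;
        Node<K, V> next;

        Node(K key, V value, Node<K, V> next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    // Fixed-size bucket array; the length is a power of two and never grows in this sketch.
    @SuppressWarnings("unchecked")
    private final Node<K, V>[] table = (Node<K, V>[]) new Node[16];

    private int indexOf(Object key) {
        int h = (key == null) ? 0 : key.hashCode();
        return h & (table.length - 1);
    }

    // Returns the previous value for the key, or null if there was none.
    V put(K key, V value) {
        int i = indexOf(key);
        for (Node<K, V> n = table[i]; n != null; n = n.next) {
            if (Objects.equals(n.key, key)) {
                V old = n.value;
                n.value = value;
                return old;
            }
        }
        // No matching key in the chain: the new node becomes the head of the list.
        table[i] = new Node<>(key, value, table[i]);
        return null;
    }

    V get(Object key) {
        for (Node<K, V> n = table[indexOf(key)]; n != null; n = n.next) {
            if (Objects.equals(n.key, key)) {
                return n.value;
            }
        }
        return null;
    }
}
```

With Integer keys, 1 and 17 fall into the same slot (both have low four bits 0001), so they end up chained in one list, yet each is still retrievable by its own key.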

3. Constructor and Fields

First, let's look at the fields HashMap declares:

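    /**
     * The default initial number of buckets. The number of buckets in a HashMap must be a power of two.
     */
    static final int DEFAULT_INITIAL_CAPACITY = 16;

    /**
     * The maximum number of buckets in a HashMap: 1073741824.
     */
    static final int MAXIMUM_CAPACITY = 1 << 30;

    /**
     * The default load factor. The map becomes eligible for resizing once the number of entries reaches 75% of the capacity.
     */
    static final float DEFAULT_LOAD_FACTOR = 0.75f;

    /**
     * The storage structure of the HashMap.
     */
    transient Entry<K,V>[] table;

    /**
     * The number of entries (key-value pairs) in the HashMap.
     */
    transient int size;

    /**
     * The resize threshold, equal to capacity * load factor. Once the total number of entries across all buckets reaches this threshold, the HashMap performs a resize to grow automatically.
     */
    int threshold;

    /**
     * The load factor. Together with the capacity, it determines when the HashMap resizes.
     */
    final float loadFactor;

    /**
     * The number of times this HashMap has been structurally modified, for example by insertions and removals. It is used to make iterators fail fast.
     */
    transient int modCount;

    /**
     * The default capacity threshold above which alternative hashing is used for String keys.
     */
    static final int ALTERNATIVE_HASHING_THRESHOLD_DEFAULT = Integer.MAX_VALUE;

    /**
     * Whether to use the alternative hash function for String keys.
     */
    transient boolean useAltHashing;

    /**
     * A random seed associated with this instance, mixed into the hash code computation of keys to reduce the probability of hash collisions.
     */
    transient final int hashSeed = sun.misc.Hashing.randomHashSeed(this);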
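The modCount field is easiest to understand by watching the fail-fast behaviour it enables. The sketch below uses the public java.util.HashMap API; FailFastDemo and its method names are made up for illustration. Removing through the map while iterating changes modCount behind the iterator's back, whereas Iterator.remove keeps the iterator's expected count in sync.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

// Illustrative sketch; FailFastDemo is a hypothetical helper class.
final class FailFastDemo {

    private static Map<String, Integer> sample() {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);
        return map;
    }

    // Structural modification through the map during iteration: the iterator
    // notices the changed modCount on its next step and throws.
    static boolean directRemovalFailsFast() {
        Map<String, Integer> map = sample();
        try {
            for (String key : map.keySet()) {
                map.remove(key);
            }
        } catch (ConcurrentModificationException expected) {
            return true;
        }
        return false;
    }

    // Removal through the iterator itself is the supported way to delete while iterating.
    static int removeAllViaIterator() {
        Map<String, Integer> map = sample();
        Iterator<String> it = map.keySet().iterator();
        while (it.hasNext()) {
            it.next();
            it.remove();
        }
        return map.size();
    }
}
```

Fail-fast detection is a best-effort debugging aid, so programs should not depend on the exception for correctness.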

Constructor:

    public HashMap(int initialCapacity, float loadFactor) {
        // Validate the initial capacity
        if (initialCapacity < 0)
            throw new IllegalArgumentException("Illegal initial capacity: " +
                                               initialCapacity);
        // Clamp the initial capacity to the maximum capacity
        if (initialCapacity > MAXIMUM_CAPACITY)
            initialCapacity = MAXIMUM_CAPACITY;
        // Validate the load factor
        if (loadFactor <= 0 || Float.isNaN(loadFactor))
            throw new IllegalArgumentException("Illegal load factor: " +
                                               loadFactor);

        // Find a power of 2 >= initialCapacity
        int capacity = 1;
        // Round the capacity up to a power of two
        while (capacity < initialCapacity)
            capacity <<= 1;

        // Load factor
        this.loadFactor = loadFactor;
        // Resize threshold
        threshold = (int)Math.min(capacity * loadFactor, MAXIMUM_CAPACITY + 1);
        table = new Entry[capacity];
        // Related to how hash values are computed
        useAltHashing = sun.misc.VM.isBooted() &&
                (capacity >= Holder.ALTERNATIVE_HASHING_THRESHOLD);
        init();
    }
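The two calculations in the constructor, rounding the capacity up to a power of two and deriving the threshold, can be pulled out and tried on their own. The sketch below copies that arithmetic into a standalone class; CapacityMath and its method names are hypothetical, not part of the JDK.

```java
// Illustrative sketch; CapacityMath is a hypothetical class that repeats
// the arithmetic of the constructor shown above.
final class CapacityMath {
    static final int MAXIMUM_CAPACITY = 1 << 30;

    // Smallest power of two that is >= initialCapacity (capped at MAXIMUM_CAPACITY).
    static int roundUpToPowerOfTwo(int initialCapacity) {
        if (initialCapacity < 0) {
            throw new IllegalArgumentException("Illegal initial capacity: " + initialCapacity);
        }
        if (initialCapacity > MAXIMUM_CAPACITY) {
            initialCapacity = MAXIMUM_CAPACITY;
        }
        int capacity = 1;
        while (capacity < initialCapacity) {
            capacity <<= 1;
        }
        return capacity;
    }

    // Resize threshold: capacity * loadFactor, truncated to an int.
    static int threshold(int capacity, float loadFactor) {
        return (int) Math.min(capacity * loadFactor, MAXIMUM_CAPACITY + 1);
    }

    public static void main(String[] args) {
        int capacity = roundUpToPowerOfTwo(10);
        System.out.println(capacity + " buckets, threshold " + threshold(capacity, 0.75f));
    }
}
```

A requested capacity of 10 is rounded up to 16, and with the default load factor of 0.75 the threshold is 12.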

4. Basic HashMap Operations

The put method

    public V put(K key, V value) {
        // If the key is null, delegate to putForNullKey.
        if (key == null)
            return putForNullKey(value);
        // Compute the hash value
        int hash = hash(key);
        // Derive the bucket index from the hash value
        int i = indexFor(hash, table.length);
        // Walk the linked list in that bucket
        for (Entry<K,V> e = table[i]; e != null; e = e.next) {
            Object k;
            // If the hashes match and the keys are equal, replace the old value with the new one and return the old value
            if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
                V oldValue = e.value;
                e.value = value;
                e.recordAccess(this);
                return oldValue;
            }
        }
        // Used by the fail-fast mechanism
        modCount++;
        // The bucket is empty, or no entry in it has a key equal to the key being put: insert a new node
        addEntry(hash, key, value, i);
        return null;
    }

The putForNullKey method:

    private V putForNullKey(V value) {
        // Walk the linked list in the first bucket
        for (Entry<K,V> e = table[0]; e != null; e = e.next) {
            // If an entry with a null key exists, replace the old value with the new one and return the old value
            if (e.key == null) {
                V oldValue = e.value;
                e.value = value;
                e.recordAccess(this);
                return oldValue;
            }
        }
        modCount++;
        // The first bucket is empty or holds no entry with a null key: insert a new node
        addEntry(0, null, value, 0);
        return null;
    }
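The return values of put and putForNullKey are visible from the outside. A small sketch against the public java.util.HashMap API (PutDemo is a hypothetical class name) shows that put returns null for a new key, returns the previous value when it overwrites, and treats the null key like any other single key.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch; PutDemo is a hypothetical helper class.
final class PutDemo {
    static String[] run() {
        Map<String, String> map = new HashMap<>();
        String first = map.put("k", "v1");     // no previous mapping
        String second = map.put("k", "v2");    // overwrites, returns the old value
        String nullFirst = map.put(null, "n1");  // null key is allowed
        String nullSecond = map.put(null, "n2"); // overwrites the null-key entry
        return new String[] {
            first, second, nullFirst, nullSecond, map.get(null), String.valueOf(map.size())
        };
    }

    public static void main(String[] args) {
        for (String s : run()) {
            System.out.println(s);
        }
    }
}
```

After these four calls the map holds two entries: "k" mapped to "v2" and the null key mapped to "n2".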

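The hash method:

    final int hash(Object k) {
        int h = 0;
        if (useAltHashing) {
            if (k instanceof String) {
                // Use the alternative hash function for String keys
                return sun.misc.Hashing.stringHash32((String) k);
            }
            // Random seed, used to lower the chance of collisions
            h = hashSeed;
        }

        h ^= k.hashCode();

        // This function ensures that hashCodes that differ only by
        // constant multiples at each bit position have a bounded
        // number of collisions (approximately 8 at default load factor).
        // Mix the high and low bits
        h ^= (h >>> 20) ^ (h >>> 12);
        return h ^ (h >>> 7) ^ (h >>> 4);
    }

This method is a "perturbation function". It mixes the high bits into the low bits to make the low bits more random, so that two numbers with identical low bits but different high bits are less likely to collide.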
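A quick way to see the effect is to take two hash codes that differ only in their high bits and compare the bucket each lands in with and without the mixing step. This sketch assumes alternative hashing is off, so h starts at 0 and the seed plays no part; HashSpreadDemo, spread and bucket are hypothetical names.

```java
// Illustrative sketch; HashSpreadDemo is a hypothetical class. spread() repeats
// the two mixing lines of hash() for the case useAltHashing == false.
final class HashSpreadDemo {

    static int spread(int hashCode) {
        int h = hashCode;
        h ^= (h >>> 20) ^ (h >>> 12);
        return h ^ (h >>> 7) ^ (h >>> 4);
    }

    // Same masking as indexFor: keep only the low bits.
    static int bucket(int h, int length) {
        return h & (length - 1);
    }

    public static void main(String[] args) {
        int a = 0x10000;
        int b = 0x20000;
        System.out.println("raw:    " + bucket(a, 16) + " and " + bucket(b, 16));
        System.out.println("spread: " + bucket(spread(a), 16) + " and " + bucket(spread(b), 16));
    }
}
```

With a 16-slot table, the raw values 0x10000 and 0x20000 both map to bucket 0, while the mixed values map to buckets 1 and 2.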

The indexFor method:

    static int indexFor(int h, int length) {
        // Bitwise AND of the hash value with (array length - 1)
        return h & (length-1);
    }

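This method determines which array slot an element goes into. The parameter h is an int produced by the hash method. A 32-bit signed int ranges from -2147483648 to 2147483647, so h can be far larger than the array length, or negative, and cannot be used as an index directly. It first has to be reduced modulo the array length, and that is what indexFor does. Because length is always a power of two, h & (length-1) gives the same result as h % length for non-negative h; for negative h it still yields a valid non-negative index, where % would return a negative remainder. The & operation is also cheaper than %.

This is also why the array length of a HashMap is always a power of two. In binary, 2^N - 1 is a "low-bit mask": the AND operation clears the high bits of the hash and keeps only the low bits, which shrinks the value into the index range. Take the initial length 16 as an example: 16-1=15, which in binary is 00000000 00000000 00001111 (only the low 24 bits are shown). ANDing it with some hash value, as below, extracts just the lowest four bits.

    10100101 11000100 00100101
&   00000000 00000000 00001111
----------------------------------
    00000000 00000000 00000101 // all high bits are cleared; only the last four bits remain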
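The relationship between the mask and the modulo operation can be checked directly. In the sketch below (IndexForDemo is a hypothetical class name), the mask agrees with % for a non-negative hash and with Math.floorMod for a negative one, which is the case where plain % would produce an unusable negative index.

```java
// Illustrative sketch; IndexForDemo is a hypothetical class wrapping the
// one-line indexFor computation shown above.
final class IndexForDemo {

    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    public static void main(String[] args) {
        int length = 16; // must be a power of two for the mask to work
        int positive = 37;
        int negative = -7;
        System.out.println(indexFor(positive, length) + " vs " + (positive % length));
        System.out.println(indexFor(negative, length) + " vs " + (negative % length)
                + " vs floorMod " + Math.floorMod(negative, length));
    }
}
```

For 37 both approaches give 5. For -7 the mask gives 9, matching Math.floorMod, while -7 % 16 evaluates to -7.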

As a result, two numbers that differ greatly will still collide whenever their low bits are the same, which can hurt performance badly. This is where the hash method shows its value.

        h ^= k.hashCode();

        // This function ensures that hashCodes that differ only by
        // constant multiples at each bit position have a bounded
        // number of collisions (approximately 8 at default load factor).
        // Mix the high and low bits
        h ^= (h >>> 20) ^ (h >>> 12);
        return h ^ (h >>> 7) ^ (h >>> 4);

h ^= k.hashCode(): 0101 1101 0010 1111 1100 0110 0011 0101
------------------------------------------------------------
h >>> 20         : 0000 0000 0000 0000 0000 0101 1101 0010
h >>> 12         : 0000 0000 0000 0101 1101 0010 1111 1100
------------------------------------------------------------
(h >>> 20) ^ (h >>> 12)
                 : 0000 0000 0000 0101 1101 0111 0010 1110
------------------------------------------------------------
h ^= (h >>> 20) ^ (h >>> 12)
                 : 0101 1101 0010 1010 0001 0001 0001 1011
------------------------------------------------------------
h >>> 7          : 0000 0000 1011 1010 0101 0100 0010 0010
h >>> 4          : 0000 0101 1101 0010 1010 0001 0001 0001
------------------------------------------------------------
(h >>> 7) ^ (h >>> 4)
                 : 0000 0101 0110 1000 1111 0101 0011 0011
------------------------------------------------------------
h ^ (h >>> 7) ^ (h >>> 4)
                 : 0101 1000 0100 0010 1110 0100 0010 1000
------------------------------------------------------------
h & (length-1)   : 0000 0000 0000 0000 0000 0000 0000 1000   = 8

In this way, XORing the high and low bits together makes the low bits more random and reduces the probability of collisions.
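The bit trace above can be reproduced mechanically. The sketch below starts from the same value, 0x5D2FC635 (the first row of the trace written in hex), applies the two mixing steps separately, and prints each intermediate result in the same nibble-grouped binary form. HashTraceCheck, step1, step2 and bits are hypothetical names, and a table length of 16 is assumed for the final mask.

```java
// Illustrative sketch; HashTraceCheck is a hypothetical class that replays
// the worked example with a table length of 16.
final class HashTraceCheck {

    // h ^= (h >>> 20) ^ (h >>> 12)
    static int step1(int h) {
        return h ^ ((h >>> 20) ^ (h >>> 12));
    }

    // h ^ (h >>> 7) ^ (h >>> 4)
    static int step2(int h) {
        return h ^ (h >>> 7) ^ (h >>> 4);
    }

    // 32-bit binary string, zero-padded and grouped in blocks of four.
    static String bits(int v) {
        String s = String.format("%32s", Integer.toBinaryString(v)).replace(' ', '0');
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 32; i += 4) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(s, i, i + 4);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        int h = 0x5D2FC635;
        System.out.println("hashCode : " + bits(h));
        int afterStep1 = step1(h);
        System.out.println("step 1   : " + bits(afterStep1));
        int afterStep2 = step2(afterStep1);
        System.out.println("step 2   : " + bits(afterStep2));
        System.out.println("index    : " + (afterStep2 & (16 - 1)));
    }
}
```

The intermediate values are 0x5D2A111B after the first step and 0x5842E428 after the second, and the final index is 8, in agreement with the trace.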

The addEntry method:

    void addEntry(int hash, K key, V value, int bucketIndex) {
        // If size has reached the threshold and the target bucket is not empty
        if ((size >= threshold) && (null != table[bucketIndex])) {
            // Double the capacity
            resize(2 * table.length);
            // Recompute the hash value
            hash = (null != key) ? hash(key) : 0;
            // Recompute the bucket index
            bucketIndex = indexFor(hash, table.length);
        }
        // Create the node
        createEntry(hash, key, value, bucketIndex);
    }
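Note that the resize check has two parts, so reaching the threshold alone is not enough in this version. The condition is small enough to restate as a standalone predicate; AddEntryCondition and shouldResize are hypothetical names used only for this sketch.

```java
// Illustrative sketch; AddEntryCondition is a hypothetical class that restates
// the condition at the top of addEntry as a pure function.
final class AddEntryCondition {

    // A resize happens only when size has reached the threshold AND
    // the bucket the new entry would go into already holds something.
    static boolean shouldResize(int size, int threshold, boolean targetBucketOccupied) {
        return size >= threshold && targetBucketOccupied;
    }

    public static void main(String[] args) {
        System.out.println(shouldResize(12, 12, true));
        System.out.println(shouldResize(12, 12, false));
        System.out.println(shouldResize(11, 12, true));
    }
}
```

One consequence is that size can temporarily exceed the threshold: an insertion into an empty bucket never triggers a resize, however full the rest of the table is.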

The resize method:

    void resize(int newCapacity) {
        Entry[] oldTable = table;
        int oldCapacity = oldTable.length;
        // The old capacity is already the maximum: set the threshold to Integer.MAX_VALUE and give up resizing
        if (oldCapacity == MAXIMUM_CAPACITY) {
            threshold = Integer.MAX_VALUE;
            return;
        }
        // Create an array with the new capacity
        Entry[] newTable = new Entry[newCapacity];
        boolean oldAltHashing = useAltHashing;
        // Work out whether the keys' hash values need to be recomputed
        useAltHashing |= sun.misc.VM.isBooted() &&
                (newCapacity >= Holder.ALTERNATIVE_HASHING_THRESHOLD);
        boolean rehash = oldAltHashing ^ useAltHashing;
        /**
         * Move every entry from the old bucket array into the new one.
         * During the move, the absolute position of an entry in the bucket array may change.
         * This is why HashMap cannot guarantee that the order of its entries stays constant over time.
         */
        transfer(newTable, rehash);
        table = newTable;
        // Recompute the resize threshold
        threshold = (int)Math.min(newCapacity * loadFactor, MAXIMUM_CAPACITY + 1);
    }
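How far an entry can move during a resize follows from the masking in indexFor. When the capacity doubles, the mask gains exactly one extra bit, so an entry either keeps its index or moves up by the old capacity. The sketch below checks this for a range of hash values; ResizeIndexDemo and staysOrMovesByOldCapacity are hypothetical names, and the hash is assumed to stay the same across the resize (no rehash).

```java
// Illustrative sketch; ResizeIndexDemo is a hypothetical class. It assumes the
// entry's hash is unchanged by the resize, i.e. the rehash flag is false.
final class ResizeIndexDemo {

    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    // True if every hash in [0, limit) lands at its old index or at
    // old index + oldCapacity once the table length is doubled.
    static boolean staysOrMovesByOldCapacity(int oldCapacity, int limit) {
        for (int h = 0; h < limit; h++) {
            int oldIndex = indexFor(h, oldCapacity);
            int newIndex = indexFor(h, 2 * oldCapacity);
            if (newIndex != oldIndex && newIndex != oldIndex + oldCapacity) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("hash 5 : " + indexFor(5, 16) + " -> " + indexFor(5, 32));
        System.out.println("hash 21: " + indexFor(21, 16) + " -> " + indexFor(21, 32));
        System.out.println(staysOrMovesByOldCapacity(16, 1000));
    }
}
```

Hashes 5 and 21 share bucket 5 in a 16-slot table; after doubling to 32 slots, 5 stays at index 5 while 21 moves to index 21.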

The transfer method:

    void transfer(Entry[] newTable, boolean rehash) {
        int newCapacity = newTable.length;
        // Walk the current table and add each of its entries to newTable
        for (Entry<K,V> e : table) {
            while(null != e) {
                Entry<K,V> next = e.next;
                if (rehash) {
                    // Recompute the hash value
                    e.hash = null == e.key ? 0 : hash(e.key);
                }
                // Compute the bucket index
                int i = indexFor(e.hash, newCapacity);
                // Insert at the head of the linked list
                e.next = newTable[i];
                // Store at index i; entries that land in the same bucket therefore end up in reverse of their original order
                newTable[i] = e;
                e = next;
            }
        }
    }
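The order reversal caused by head insertion can be seen in isolation. The sketch below runs the same pointer steps as the inner loop of transfer on a three-node chain, under the simplifying assumption that every node lands in the same new bucket. TransferOrderDemo, Node, moveWithHeadInsertion and render are hypothetical names.

```java
// Illustrative sketch; TransferOrderDemo is a hypothetical class. It assumes
// all nodes of the old chain map to one and the same bucket in the new table.
final class TransferOrderDemo {

    static final class Node {
        final String key;
        Node next;

        Node(String key, Node next) {
            this.key = key;
            this.next = next;
        }
    }

    // Moves each node to the head of the new bucket, one at a time.
    static Node moveWithHeadInsertion(Node oldHead) {
        Node newBucket = null;
        Node e = oldHead;
        while (e != null) {
            Node next = e.next;   // remember the rest of the old chain
            e.next = newBucket;   // link in front of what is already there
            newBucket = e;        // the moved node is the new head
            e = next;
        }
        return newBucket;
    }

    static String render(Node head) {
        StringBuilder sb = new StringBuilder();
        for (Node n = head; n != null; n = n.next) {
            if (sb.length() > 0) {
                sb.append(" -> ");
            }
            sb.append(n.key);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Node chain = new Node("a", new Node("b", new Node("c", null)));
        System.out.println(render(chain));
        System.out.println(render(moveWithHeadInsertion(chain)));
    }
}
```

The chain a -> b -> c comes out as c -> b -> a.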

The createEntry method:

    void createEntry(int hash, K key, V value, int bucketIndex) {
        Entry<K,V> e = table[bucketIndex];
        // Insert the new node at the head of the linked list
        table[bucketIndex] = new Entry<>(hash, key, value, e);
        size++;
    }

The get method

    public V get(Object key) {
        // If the key is null, delegate to getForNullKey
        if (key == null)
            return getForNullKey();
        // The key is not null: delegate to getEntry
        Entry<K,V> entry = getEntry(key);

        return null == entry ? null : entry.getValue();
    }

The getForNullKey method:

    private V getForNullKey() {
        // Walk the linked list in the first bucket, because putForNullKey stores the null key there.
        for (Entry<K,V> e = table[0]; e != null; e = e.next) {
            if (e.key == null)
                return e.value;
        }
        return null;
    }

The getEntry method:

    final Entry<K,V> getEntry(Object key) {

        // Compute the hash value of the key
        int hash = (key == null) ? 0 : hash(key);
        // Walk the linked list in the corresponding bucket
        for (Entry<K,V> e = table[indexFor(hash, table.length)];
             e != null;
             e = e.next) {
            Object k;
            if (e.hash == hash &&
                ((k = e.key) == key || (key != null && key.equals(k))))
                return e;
        }
        return null;
    }
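Because get returns null both when no entry is found and when the entry's value is null, a null result is ambiguous. The sketch below uses the public java.util.HashMap API to show the difference and how containsKey resolves it; GetNullDemo is a hypothetical class name.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch; GetNullDemo is a hypothetical helper class.
final class GetNullDemo {

    // Returns: get("present") is null, get("absent") is null,
    //          containsKey("present"), containsKey("absent").
    static boolean[] run() {
        Map<String, String> map = new HashMap<>();
        map.put("present", null); // the key exists, its value is null
        return new boolean[] {
            map.get("present") == null,
            map.get("absent") == null,
            map.containsKey("present"),
            map.containsKey("absent")
        };
    }

    public static void main(String[] args) {
        for (boolean b : run()) {
            System.out.println(b);
        }
    }
}
```

Both get calls return null, but containsKey tells the two cases apart.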

The remove method

    public V remove(Object key) {
        Entry<K,V> e = removeEntryForKey(key);
        return (e == null ? null : e.value);
    }

The removeEntryForKey method:

    final Entry<K,V> removeEntryForKey(Object key) {
        // Compute the hash value of the key
        int hash = (key == null) ? 0 : hash(key);
        // Compute the bucket index
        int i = indexFor(hash, table.length);
        // Tracks the node just before the one being examined
        Entry<K,V> prev = table[i];
        // The node being examined (the removal candidate)
        Entry<K,V> e = prev;

        while (e != null) {
            Entry<K,V> next = e.next;
            Object k;
            // Is this the node to remove?
            if (e.hash == hash &&
                ((k = e.key) == key || (key != null && key.equals(k)))) {
                modCount++;
                size--;
                // Is the node to remove the head of the linked list?
                if (prev == e)
                    // Point the head of the list at the next node
                    table[i] = next;
                else
                    // Link the previous node to the node after the removed one
                    prev.next = next;
                e.recordRemoval(this);
                return e;
            }
            prev = e;
            e = next;
        }

        return e;
    }
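The prev/e bookkeeping in removeEntryForKey is the standard way to unlink a node from a singly linked list. The sketch below applies the same steps to a bare chain of string keys, with no hashing or bucket array involved; UnlinkDemo, Node, chain, remove and render are hypothetical names.

```java
// Illustrative sketch; UnlinkDemo is a hypothetical class showing only the
// list-unlinking part of removeEntryForKey.
final class UnlinkDemo {

    static final class Node {
        final String key;
        Node next;

        Node(String key, Node next) {
            this.key = key;
            this.next = next;
        }
    }

    // Builds a chain in the given order.
    static Node chain(String... keys) {
        Node head = null;
        for (int i = keys.length - 1; i >= 0; i--) {
            head = new Node(keys[i], head);
        }
        return head;
    }

    // Unlinks the first node with the given key and returns the (possibly new) head.
    static Node remove(Node head, String key) {
        Node prev = head;
        Node e = head;
        while (e != null) {
            Node next = e.next;
            if (e.key.equals(key)) {
                if (prev == e) {
                    head = next;       // removing the head: the next node becomes the head
                } else {
                    prev.next = next;  // bypass the removed node
                }
                return head;
            }
            prev = e;
            e = next;
        }
        return head; // key not found: chain unchanged
    }

    static String render(Node head) {
        StringBuilder sb = new StringBuilder();
        for (Node n = head; n != null; n = n.next) {
            if (sb.length() > 0) {
                sb.append(" -> ");
            }
            sb.append(n.key);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(render(remove(chain("a", "b", "c"), "a")));
        System.out.println(render(remove(chain("a", "b", "c"), "b")));
    }
}
```

Removing "a" from a -> b -> c leaves b -> c, and removing "b" leaves a -> c.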
