[Source Code Study] HashMap

Axin • Published: 2017-09-14

HashMap extends AbstractMap, where many general-purpose methods, such as size() and isEmpty(), are already implemented. Let's start with a fairly simple one, the get method:

    public V get(Object key) {
        Iterator<Entry<K,V>> i = entrySet().iterator();
        if (key==null) {
            while (i.hasNext()) {
                Entry<K,V> e = i.next();
                if (e.getKey()==null)
                    return e.getValue();
            }
        } else {
            while (i.hasNext()) {
                Entry<K,V> e = i.next();
                if (key.equals(e.getKey()))
                    return e.getValue();
            }
        }
        return null;
    }

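Looking at this alone, you cannot see the lookup strategy a Map uses; all it shows is a linear scan that tests every entry for a match. (HashMap overrides get with its own hash-based lookup.)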
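To see how much AbstractMap supplies by itself, here is a minimal sketch. The class ListBackedMap and its sample() helper are hypothetical names made up for illustration. The map only implements entrySet(); get(), size() and isEmpty() are inherited from AbstractMap, with get() doing the linear scan shown above.

```java
import java.util.AbstractMap;
import java.util.AbstractSet;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical map backed by a plain list of entries (illustration only).
class ListBackedMap<K, V> extends AbstractMap<K, V> {
    private final List<Map.Entry<K, V>> entries;

    ListBackedMap(List<Map.Entry<K, V>> entries) {
        this.entries = entries;
    }

    // entrySet() is the only abstract method of AbstractMap.
    @Override
    public Set<Map.Entry<K, V>> entrySet() {
        return new AbstractSet<Map.Entry<K, V>>() {
            @Override
            public Iterator<Map.Entry<K, V>> iterator() {
                return entries.iterator();
            }

            @Override
            public int size() {
                return entries.size();
            }
        };
    }

    // Builds a small sample map with two entries.
    static Map<String, Integer> sample() {
        List<Map.Entry<String, Integer>> data = new ArrayList<>();
        data.add(new AbstractMap.SimpleEntry<>("a", 1));
        data.add(new AbstractMap.SimpleEntry<>("b", 2));
        return new ListBackedMap<>(data);
    }

    public static void main(String[] args) {
        Map<String, Integer> m = sample();
        System.out.println(m.get("b"));   // found by the inherited linear scan
        System.out.println(m.size());     // inherited, delegates to entrySet().size()
        System.out.println(m.isEmpty());  // inherited
    }
}
```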

The remove method first finds the matching entry, then deletes it from the underlying collection through the Iterator's remove method.

    public V remove(Object key) {
        Iterator<Entry<K,V>> i = entrySet().iterator();
        Entry<K,V> correctEntry = null;
        if (key==null) {
            while (correctEntry==null && i.hasNext()) {
                Entry<K,V> e = i.next();
                if (e.getKey()==null)
                    correctEntry = e;
            }
        } else {
            while (correctEntry==null && i.hasNext()) {
                Entry<K,V> e = i.next();
                if (key.equals(e.getKey()))
                    correctEntry = e;
            }
        }

        V oldValue = null;
        if (correctEntry !=null) {
            oldValue = correctEntry.getValue();
            i.remove();
        }
        return oldValue;
    }
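The same scan-then-remove pattern can be tried against any Map from outside the class. The sketch below is an assumption-laden illustration: removeByScan is a hypothetical helper, not a JDK method, and it relies only on Iterator.remove() deleting the entry last returned by next().

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

class IteratorRemoveDemo {
    // Hypothetical helper that mirrors the pattern of AbstractMap.remove.
    static <K, V> V removeByScan(Map<K, V> map, Object key) {
        Iterator<Map.Entry<K, V>> it = map.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<K, V> e = it.next();
            boolean matches = (key == null) ? e.getKey() == null : key.equals(e.getKey());
            if (matches) {
                V old = e.getValue(); // read the value before the entry is removed
                it.remove();          // removes the entry last returned by next()
                return old;
            }
        }
        return null; // no mapping for this key
    }

    public static void main(String[] args) {
        Map<String, Integer> m = new HashMap<>();
        m.put("a", 1);
        m.put("b", 2);
        System.out.println(removeByScan(m, "a"));
        System.out.println(m);
    }
}
```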

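Next we reach the key and value fields (in AbstractMap these cache the views returned by keySet() and values()), and two Java keywords appear right away, transient and volatile: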

    transient volatile Set<K>        keySet = null;
    transient volatile Collection<V> values = null;

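The transient keyword means the field is skipped during serialization and deserialization; this comes into play when an object is serialized, for example to a file.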
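A small round trip through Java serialization makes the effect of transient visible. This is only a sketch: the Account class and its fields are hypothetical, and the bytes go to an in-memory buffer rather than a file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class TransientDemo {
    // Hypothetical class for illustration only.
    static class Account implements Serializable {
        private static final long serialVersionUID = 1L;
        String user;                  // written by default serialization
        transient String cachedToken; // skipped by default serialization

        Account(String user, String cachedToken) {
            this.user = user;
            this.cachedToken = cachedToken;
        }
    }

    // Serializes the object to bytes and reads it back.
    static Account roundTrip(Account in) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(in);
            }
            try (ObjectInputStream oin =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Account) oin.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Account copy = roundTrip(new Account("alice", "secret"));
        System.out.println(copy.user);        // survives the round trip
        System.out.println(copy.cachedToken); // the transient field comes back as null
    }
}
```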

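volatile is a bit more involved. Once a shared variable (an instance field or a static field of a class) is declared volatile, it carries two guarantees:

 1) Visibility across threads: when one thread changes the variable's value, the new value is immediately visible to the other threads.

 2) Instruction reordering around it is prohibited.

Reference: http://www.cnblogs.com/dolphin0520/p/3920373.html

This has a flavor of thread safety: when one thread modifies the variable, the other threads using it will see the change. It does not, however, make compound operations such as count++ atomic.

The example given in that article: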

    // Thread 1
    boolean stop = false;
    while (!stop) {
        doSomething();
    }

    // Thread 2
    stop = true;

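The code above can loop forever: if stop is not volatile, thread 1 may keep reading a stale copy and never observe the write made by thread 2.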
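A runnable variant of the stop-flag example might look like the sketch below. The class name, the 50 ms delay and the 2 s join timeout are arbitrary choices for illustration; the point is only that declaring the flag volatile guarantees the worker eventually sees the write.

```java
class StopFlagDemo {
    // volatile guarantees that the write below becomes visible to the worker thread.
    private static volatile boolean stop = false;

    // Starts a worker that spins until the flag is set; returns true if it terminated.
    static boolean runAndStop() {
        stop = false;
        Thread worker = new Thread(() -> {
            while (!stop) {
                // doSomething() would go here
            }
        });
        worker.setDaemon(true); // do not keep the JVM alive if something goes wrong
        worker.start();
        try {
            Thread.sleep(50);   // let the worker spin for a moment
            stop = true;        // the write made by "thread 2"
            worker.join(2000);  // wait for the worker to notice
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            stop = true;
        }
        return !worker.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("worker stopped: " + runAndStop());
    }
}
```

Removing the volatile modifier does not make the loop hang on every JVM, but it removes the guarantee that the worker ever observes the change.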

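Next, table initialization. When one Map is used to initialize another, the new table is not simply twice the size of the original: the requested capacity is max(m.size() / 0.75 + 1, 16), and inflateTable rounds it up to a power of two: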

    public HashMap(Map<? extends K, ? extends V> m) {
        this(Math.max((int) (m.size() / DEFAULT_LOAD_FACTOR) + 1,
                      DEFAULT_INITIAL_CAPACITY), DEFAULT_LOAD_FACTOR);
        inflateTable(threshold);

        putAllForCreate(m);
    }

    private static int roundUpToPowerOf2(int number) {
        // assert number >= 0 : "number must be non-negative";
        return number >= MAXIMUM_CAPACITY
                ? MAXIMUM_CAPACITY
                : (number > 1) ? Integer.highestOneBit((number - 1) << 1) : 1;
    }

    /**
     * Inflates the table.
     */
    private void inflateTable(int toSize) {
        // Find a power of 2 >= toSize
        int capacity = roundUpToPowerOf2(toSize);

        threshold = (int) Math.min(capacity * loadFactor, MAXIMUM_CAPACITY + 1);
        table = new Entry[capacity];
        initHashSeedAsNeeded(capacity);
    }

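Given the size of the original Map, the constructor allocates a new Entry array whose length is the smallest power of two that is at least the requested capacity. Integer.highestOneBit(num) returns the largest power of two that is less than or equal to num (only the highest set bit is kept); applying it to (number - 1) << 1 therefore yields the smallest power of two that is greater than or equal to number. The possible results are the powers of two: 1, 2, 4, 8, 16, 32, 64, 128, 256, 512, and so on.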

As you can see, in many places the JDK source does not multiply by 2 directly but shifts left by one bit instead, for example:

    (number - 1) << 1
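The rounding can be checked in isolation. The sketch below copies roundUpToPowerOf2 from the listing above; capacityForCopy is a hypothetical helper that replays the constructor's arithmetic with the default load factor 0.75 and default initial capacity 16 written in as literals.

```java
class PowerOfTwoDemo {
    static final int MAXIMUM_CAPACITY = 1 << 30;

    // Same logic as the JDK 7 method shown above.
    static int roundUpToPowerOf2(int number) {
        return number >= MAXIMUM_CAPACITY
                ? MAXIMUM_CAPACITY
                : (number > 1) ? Integer.highestOneBit((number - 1) << 1) : 1;
    }

    // Hypothetical helper: table length after copying a map with sourceSize entries.
    static int capacityForCopy(int sourceSize) {
        int requested = Math.max((int) (sourceSize / 0.75f) + 1, 16);
        return roundUpToPowerOf2(requested);
    }

    public static void main(String[] args) {
        System.out.println(Integer.highestOneBit(100)); // 64: largest power of two <= 100
        System.out.println(roundUpToPowerOf2(100));     // 128: smallest power of two >= 100
        System.out.println(roundUpToPowerOf2(16));      // 16: already a power of two
        System.out.println(capacityForCopy(100));       // 256: 134 requested, rounded up
        System.out.println(capacityForCopy(12));        // 32: 17 requested, rounded up
    }
}
```

Copying a 100-entry map therefore gives a 256-slot table, and copying a 12-entry map gives 32 slots, so the result depends on the rounding and is not a fixed multiple of the source size.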

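HashMap data structure

Inserting elements

1. An Entry whose key is null is stored in the linked list at the first slot of the Entry[] array, i.e. index 0. Looking closely, Map.put() actually has a return value: the value that was replaced, if there was one.

2. An Entry whose key is not null is hashed and stored in the linked list at the corresponding array slot. The slot is computed with a bitwise AND (&):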

    /**
     * Returns index for hash code h.
     */
    static int indexFor(int h, int length) {
        // assert Integer.bitCount(length) == 1 : "length must be a non-zero power of 2";
        return h & (length-1);
    }
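The return value of put() and the handling of a null key described in the two points above can be observed through the public API. A minimal sketch, where putTwice is a hypothetical helper:

```java
import java.util.HashMap;
import java.util.Map;

class PutReturnDemo {
    // Puts two values under the same key and reports what put() returned each time,
    // followed by the value finally stored.
    static String[] putTwice(String key) {
        Map<String, String> m = new HashMap<>();
        String first = m.put(key, "a");  // no previous mapping
        String second = m.put(key, "b"); // replaces "a"
        return new String[] { first, second, m.get(key) };
    }

    public static void main(String[] args) {
        String[] r = putTwice(null); // HashMap accepts a null key
        System.out.println(r[0]);    // null
        System.out.println(r[1]);    // a
        System.out.println(r[2]);    // b
    }
}
```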

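How the hash is mapped to a slot

The rules of bitwise AND are:

0 & 0 = 0

0 & 1 = 0

1 & 0 = 0

1 & 1 = 1

In other words, the result is 1 only when both bits are 1. There is a reason the array length must be a power of two. Take length = 256 (2 to the 8th power): in binary it is a 1 followed by eight 0s: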

100000000

256 - 1 = 255 in binary is exactly eight 1s:

11111111

ANDing this number with any number N keeps the low 8 bits of N:

       1 1 1 1 1 1 1 1
&    1 0 1 0 0 1 0 1 0
-------------------------------
     0 0 1 0 0 1 0 1 0

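Those eight low bits are like lamps that are switched on: however many lamps are lit, that many bits are kept. This is how hash codes get spread across the array slots.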
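The masking can be verified with a few lines of code. In this sketch indexFor is copied from the listing above, and reachableSlots is a hypothetical helper that counts which slots the hash codes 0 to 99 can reach. It also shows why a length that is not a power of two would be a poor fit: with length 10 the mask is binary 1001, so only four slots are ever used.

```java
import java.util.HashSet;
import java.util.Set;

class IndexForDemo {
    // Same as the JDK 7 method shown above.
    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    // Hypothetical helper: number of distinct slots hit by hash codes 0..99.
    static int reachableSlots(int length) {
        Set<Integer> slots = new HashSet<>();
        for (int h = 0; h < 100; h++) {
            slots.add(indexFor(h, length));
        }
        return slots.size();
    }

    public static void main(String[] args) {
        System.out.println(indexFor(0b101001010, 256)); // 74, the low 8 bits of 330
        System.out.println(Math.floorMod(330, 256));    // 74, same as the remainder
        System.out.println(indexFor(-1, 16));           // 15, negative hashes work too
        System.out.println(reachableSlots(16));         // 16: every slot is reachable
        System.out.println(reachableSlots(10));         // 4: only slots 0, 1, 8 and 9
    }
}
```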

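As the code above shows, only the low bits of Object.hashCode() take part in choosing the slot. If n - 1 = 15, i.e. binary 1111, only four bits are used and the chance of hash collisions is high. To address this, the original Object.hashCode() is scrambled a little so that every bit of the original hash code influences the final slot.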

    final int hash(Object k) {
        int h = hashSeed;
        if (0 != h && k instanceof String) {
            return sun.misc.Hashing.stringHash32((String) k);
        }

        h ^= k.hashCode();

        // This function ensures that hashCodes that differ only by
        // constant multiples at each bit position have a bounded
        // number of collisions (approximately 8 at default load factor).
        h ^= (h >>> 20) ^ (h >>> 12);
        return h ^ (h >>> 7) ^ (h >>> 4);
    }

Here the high bits are XORed into the low bits (using several different right-shift distances) to do the "scrambling".
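The effect of the scrambling is easy to measure on hash codes that differ only in their high bits. The sketch below reuses the two shift-and-XOR lines from the listing above and leaves out hashSeed and the String special case; the input values k << 16 are made-up hash codes chosen for illustration.

```java
import java.util.HashSet;
import java.util.Set;

class HashMixDemo {
    // The bit mixing from the JDK 7 hash(Object) shown above, without hashSeed.
    static int mix(int h) {
        h ^= (h >>> 20) ^ (h >>> 12);
        return h ^ (h >>> 7) ^ (h >>> 4);
    }

    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    // Counts the distinct slots hit by the hash codes k << 16 for k = 0..15,
    // which all have their low 16 bits set to zero.
    static int distinctSlots(boolean useMix, int length) {
        Set<Integer> slots = new HashSet<>();
        for (int k = 0; k < 16; k++) {
            int h = k << 16;
            slots.add(indexFor(useMix ? mix(h) : h, length));
        }
        return slots.size();
    }

    public static void main(String[] args) {
        System.out.println(distinctSlots(false, 16)); // 1: every key lands in slot 0
        System.out.println(distinctSlots(true, 16));  // 16: the high bits now matter
    }
}
```

Without mixing, all 16 hash codes collide in slot 0 of a 16-slot table; with mixing, each one ends up in its own slot.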

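Summary

How is HashMap implemented?

Elements are mapped by their hash codes and spread across an array; when hash collisions occur, the colliding elements are stored in a linked list. Java 8 added an optimization: once the number of colliding elements in a bucket passes a threshold, the linked list is converted to a red-black tree, which significantly improves performance under heavy collisions.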
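The array-plus-linked-list idea fits in a few dozen lines. The following is a toy sketch, not the JDK implementation: ChainedMapSketch is a hypothetical class with a fixed 16-slot table, no hash mixing, no resizing and no tree conversion. The keys "Aa" and "BB" are used because their String hash codes are equal (both 2112), which forces a collision.

```java
import java.util.Objects;

// Toy separate-chaining map for illustration only.
class ChainedMapSketch<K, V> {
    private static final class Node<K, V> {
        final K key;
        V value;
        Node<K, V> next;

        Node(K key, V value, Node<K, V> next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    @SuppressWarnings("unchecked")
    private final Node<K, V>[] table = (Node<K, V>[]) new Node[16];

    // A null key goes to slot 0; other keys are masked into the table.
    private int indexFor(Object key) {
        int h = (key == null) ? 0 : key.hashCode();
        return h & (table.length - 1);
    }

    // Returns the replaced value, or null if the key was not present.
    V put(K key, V value) {
        int i = indexFor(key);
        for (Node<K, V> n = table[i]; n != null; n = n.next) {
            if (Objects.equals(n.key, key)) {
                V old = n.value;
                n.value = value;
                return old;
            }
        }
        table[i] = new Node<>(key, value, table[i]); // prepend to the chain
        return null;
    }

    V get(Object key) {
        for (Node<K, V> n = table[indexFor(key)]; n != null; n = n.next) {
            if (Objects.equals(n.key, key)) {
                return n.value;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        ChainedMapSketch<String, Integer> m = new ChainedMapSketch<>();
        m.put("Aa", 1);
        m.put("BB", 2); // same hash code as "Aa", so it joins the same chain
        System.out.println(m.get("Aa"));
        System.out.println(m.get("BB"));
    }
}
```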

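What should you watch out for with HashMap?

Two parameters matter:

Capacity: the length of the bucket array (the initial capacity is its length at creation). Once the number of entries grows past the threshold, which is capacity multiplied by the load factor, the capacity is doubled.

Load factor: the default is 0.75, meaning the array is enlarged once the number of entries exceeds three quarters of the capacity.

Note that if you want a HashMap to iterate faster, choose a smaller capacity and a larger load factor. This packs the entries more densely into the array, so iteration visits fewer empty slots, at the cost of more collisions on lookup.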
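The relationship between the two parameters can be put into numbers. In the sketch below, threshold and initialCapacityFor are hypothetical helpers written for illustration; only the HashMap(int, float) constructor is a real JDK API. The sizing formula is the same one used by the copy constructor discussed earlier.

```java
import java.util.HashMap;
import java.util.Map;

class SizingDemo {
    // Number of entries after which a table of this capacity is resized.
    static int threshold(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);
    }

    // Hypothetical helper: initial capacity to request so that `expected`
    // entries fit without triggering a resize.
    static int initialCapacityFor(int expected, float loadFactor) {
        return (int) (expected / loadFactor) + 1;
    }

    // Creates a map pre-sized for the expected number of entries.
    static Map<String, Integer> presized(int expected) {
        return new HashMap<>(initialCapacityFor(expected, 0.75f), 0.75f);
    }

    public static void main(String[] args) {
        System.out.println(threshold(16, 0.75f));           // 12
        System.out.println(threshold(32, 0.75f));           // 24
        System.out.println(initialCapacityFor(100, 0.75f)); // 134
        Map<String, Integer> m = presized(100);
        m.put("a", 1);
        System.out.println(m);
    }
}
```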
