HashMap's Underlying Array Length and Bit Operations
阿新 • Published: 2021-01-07
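HashMap data structure
Anyone who has read the HashMap source in the JDK knows that its underlying data structure is an array plus linked lists.
JDK 1.8 added an optimization: when a bucket's linked list grows longer than 8 nodes, it is converted to a red-black tree (provided the array has at least 64 slots; otherwise the array is resized instead).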
The structure looks like the diagrams below.
[Figure: HashMap structure before JDK 1.8]
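That is not the focus of this article, though. My goal is to clear up something that has long puzzled me when reading the source: the bit operations used in the underlying code.
Array capacity initialization
Initialization actually happens on the first put of an element and is carried out in resize().
1. Without specifying initialCapacity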
// Use the default load factor, i.e. resize once the map is more than 3/4 full
public HashMap() {
    this.loadFactor = DEFAULT_LOAD_FACTOR; // all other fields defaulted
}

static final float DEFAULT_LOAD_FACTOR = 0.75f;

// Default initial capacity
static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // aka 16

final Node<K,V>[] resize() {
    ...
    else { // zero initial threshold signifies using defaults
        newCap = DEFAULT_INITIAL_CAPACITY;
        newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
    }
    return newTab;
}
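As a quick check of the defaults above, here is a small sketch that recomputes the initial threshold with the same arithmetic as the default branch of resize(). The class name DefaultThresholdSketch and the method defaultThreshold are made up for illustration; the two constants simply mirror the ones quoted from the JDK source.

```java
// Illustrative sketch only; class and method names are hypothetical.
class DefaultThresholdSketch {
    // Values mirrored from the HashMap source quoted above.
    static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // 16
    static final float DEFAULT_LOAD_FACTOR = 0.75f;

    // Same arithmetic as the default branch of resize().
    static int defaultThreshold() {
        return (int) (DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
    }

    public static void main(String[] args) {
        System.out.println("capacity  = " + DEFAULT_INITIAL_CAPACITY);
        System.out.println("threshold = " + defaultThreshold());
    }
}
```

With the defaults, the capacity is 16 and the threshold is 12, so the first real expansion is triggered once the map holds more than 12 entries.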
2. Specifying initialCapacity
public HashMap(int initialCapacity, float loadFactor) {
    if (initialCapacity < 0)
        throw new IllegalArgumentException("Illegal initial capacity: " +
                                           initialCapacity);
    if (initialCapacity > MAXIMUM_CAPACITY)
        initialCapacity = MAXIMUM_CAPACITY;
    if (loadFactor <= 0 || Float.isNaN(loadFactor))
        throw new IllegalArgumentException("Illegal load factor: " +
                                           loadFactor);
    this.loadFactor = loadFactor;
    this.threshold = tableSizeFor(initialCapacity);
}

// Core method that computes the initial capacity
static final int tableSizeFor(int cap) {
    int n = cap - 1;
    n |= n >>> 1;
    n |= n >>> 2;
    n |= n >>> 4;
    n |= n >>> 8;
    n |= n >>> 16;
    return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
}

final Node<K,V>[] resize() {
    ...
    int oldThr = threshold;
    ...
    else if (oldThr > 0) // initial capacity was placed in threshold
        newCap = oldThr;
    ...
    return newTab;
}
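Analyzing the tableSizeFor method
cap = 0: n = -1; after the bit operations n is still -1 (all bits set), so the method returns 1.
cap = 1: n = 0; after the bit operations n is still 0, so the method returns n + 1 = 1 as well.
cap > 1: n = cap - 1 > 0, so the binary form of n has at least one bit set to 1.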
Take cap = 17:
n = cap - 1 = 16. For a positive number the sign-magnitude, ones' complement and two's complement forms are identical: 0001 0000
n |= n >>> 1:   n >>> 1 = 0000 1000,   n = 0001 0000 | 0000 1000 = 0001 1000 (24)
n |= n >>> 2:   n >>> 2 = 0000 0110,   n = 0001 1000 | 0000 0110 = 0001 1110 (30)
n |= n >>> 4:   n >>> 4 = 0000 0001,   n = 0001 1110 | 0000 0001 = 0001 1111 (31)
n |= n >>> 8:   n >>> 8 = 0000 0000,   n = 0001 1111 | 0000 0000 = 0001 1111 (31)
n |= n >>> 16:  n >>> 16 = 0000 0000,  n = 0001 1111 | 0000 0000 = 0001 1111 (31)
Return value: n + 1 = 31 + 1 = (2^4 + 2^3 + 2^2 + 2^1 + 2^0) + 1 = 2^5 = 32
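To double-check the hand calculation, the shift-and-OR steps can be replayed in a few lines of Java. This is only a sketch: the class TableSizeTrace and the helpers bits8 and trace are names I made up, and bits8 prints just the low 8 bits, which is enough for this small example.

```java
// Illustrative trace; class and helper names are hypothetical.
class TableSizeTrace {
    static final int[] SHIFTS = {1, 2, 4, 8, 16};

    // Formats the low 8 bits of n as "xxxx xxxx".
    static String bits8(int n) {
        String s = String.format("%8s", Integer.toBinaryString(n & 0xFF)).replace(' ', '0');
        return s.substring(0, 4) + " " + s.substring(4);
    }

    // Returns the value of n after each of the five shift-and-OR steps.
    static int[] trace(int cap) {
        int[] steps = new int[SHIFTS.length];
        int n = cap - 1;
        for (int i = 0; i < SHIFTS.length; i++) {
            n |= n >>> SHIFTS[i];
            steps[i] = n;
        }
        return steps;
    }

    public static void main(String[] args) {
        int cap = 17;
        System.out.println("n = cap - 1 : " + bits8(cap - 1));
        int[] steps = trace(cap);
        for (int i = 0; i < steps.length; i++) {
            System.out.println("n |= n >>> " + SHIFTS[i] + " : " + bits8(steps[i]) + " (" + steps[i] + ")");
        }
        System.out.println("n + 1 = " + (steps[steps.length - 1] + 1));
    }
}
```

For cap = 17 the intermediate values are 24, 30, 31, 31, 31, matching the manual steps, and the final n + 1 is 32.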
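The purpose of the bit operations is to set every bit below the highest 1 bit to 1, so that in the end
n = (leading zeros omitted) 1...111
The method then returns n + 1, which is a power of two by the identity
2^k = 2^(k-1) + 2^(k-2) + ... + 2^0 + 1
Because the computation starts from cap - 1, the result is the smallest power of two that is >= cap.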
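The claim that the result is the smallest power of two >= cap can be checked against a naive loop. In the sketch below, tableSizeFor is copied from the JDK 8 source quoted earlier, while the class TableSizeCheck and the method naivePowerOfTwo are my own illustrative names, not part of the JDK.

```java
// Illustrative check; TableSizeCheck and naivePowerOfTwo are hypothetical names.
class TableSizeCheck {
    static final int MAXIMUM_CAPACITY = 1 << 30;

    // Copied from the JDK 8 HashMap source quoted above.
    static int tableSizeFor(int cap) {
        int n = cap - 1;
        n |= n >>> 1;
        n |= n >>> 2;
        n |= n >>> 4;
        n |= n >>> 8;
        n |= n >>> 16;
        return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
    }

    // Naive reference: keep doubling until the value is >= cap (capped at MAXIMUM_CAPACITY).
    static int naivePowerOfTwo(int cap) {
        int p = 1;
        while (p < cap && p < MAXIMUM_CAPACITY) {
            p <<= 1;
        }
        return p;
    }

    public static void main(String[] args) {
        int[] samples = {0, 1, 2, 3, 16, 17, 1000};
        for (int cap : samples) {
            System.out.println("cap = " + cap + " -> " + tableSizeFor(cap));
        }
        // Compare both versions over a range of small capacities.
        for (int cap = 0; cap <= (1 << 16); cap++) {
            if (tableSizeFor(cap) != naivePowerOfTwo(cap)) {
                throw new IllegalStateException("mismatch at cap = " + cap);
            }
        }
        System.out.println("bit trick and naive loop agree");
    }
}
```

The bit version needs a fixed five shift-and-OR steps regardless of cap, whereas the naive loop takes up to 30 iterations.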
In addition, when resize() actually grows the array, it doubles the capacity:
final Node<K,V>[] resize() {
    ...
    if (oldCap > 0) {
        if (oldCap >= MAXIMUM_CAPACITY) {
            threshold = Integer.MAX_VALUE;
            return oldTab;
        }
        // newCap = oldCap << 1 = oldCap * 2
        else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY &&
                 oldCap >= DEFAULT_INITIAL_CAPACITY)
            newThr = oldThr << 1; // double threshold
    }
    ...
}
So whether HashMap's underlying array is being initialized or expanded, its length is always a power of two.
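Here is a minimal sketch of why doubling preserves this property. It does not touch a real HashMap; the class DoublingSketch and its methods are hypothetical, and the MAXIMUM_CAPACITY limit is deliberately left out to keep it short.

```java
// Illustrative sketch; names are hypothetical and MAXIMUM_CAPACITY handling is omitted.
class DoublingSketch {
    // Capacities a table goes through when it starts at `initial` and doubles `times` times.
    static int[] capacities(int initial, int times) {
        int[] caps = new int[times + 1];
        caps[0] = initial;
        for (int i = 1; i <= times; i++) {
            caps[i] = caps[i - 1] << 1; // same doubling as newCap = oldCap << 1
        }
        return caps;
    }

    // A positive int is a power of two exactly when it has a single bit set.
    static boolean isPowerOfTwo(int n) {
        return n > 0 && (n & (n - 1)) == 0;
    }

    public static void main(String[] args) {
        for (int cap : capacities(16, 4)) {
            System.out.println(cap + " power of two? " + isPowerOfTwo(cap));
        }
    }
}
```

Starting from 16, the capacities are 16, 32, 64, 128, 256: shifting a single set bit one place to the left still leaves a single set bit.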
The underlying bit operation (n - 1) & hash
Several methods in HashMap use this bit operation:
final Node<K,V> getNode(int hash, Object key)
final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
boolean evict)
final void treeifyBin(Node<K,V>[] tab, int hash)
final Node<K,V> removeNode(int hash, Object key, Object value,
boolean matchValue, boolean movable)
public V computeIfAbsent(K key,
Function<? super K, ? extends V> mappingFunction)
public V compute(K key,
BiFunction<? super K, ? super V, ? extends V> remappingFunction)
public V merge(K key, V value,
BiFunction<? super V, ? super V, ? extends V> remappingFunction)
final void removeTreeNode(HashMap<K,V> map, Node<K,V>[] tab,
boolean movable)
// for example
int n;
n = tab.length;
int index = (n - 1) & hash;
// or
n = tab.length;
tab[i = (n - 1) & hash]
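In every case the goal is to determine the element's position in the array. Let us analyze (n - 1) & hash.
From the previous section, the array length is
n = tab.length = 2^m
n - 1 = 2^m - 1 = 2^(m-1) + 2^(m-2) + ... + 2^0
In binary,
n - 1 = (leading zeros omitted) 1...111 (m ones)
(n - 1) & hash
By the nature of bitwise AND, this keeps only the low m bits of the binary hash value.
For example, with n = 16 = 2^4:
n - 1 = 16 - 1 = 2^3 + 2^2 + 2^1 + 2^0 = 0000 1111
(n - 1) & hash = 0000 1111 & hash, i.e. the low 4 bits of the binary hash value.
No matter what the hash value is,
the result can only fall in [0000 0000, 0000 1111] = [0, 15],
which corresponds exactly to the valid indices of the array.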
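The masking can be tried out directly. The sketch below compares (n - 1) & hash with Math.floorMod(hash, n) for n = 16 and then, purely for contrast, shows what the same mask would do for a length that is not a power of two. The class IndexMaskSketch and the methods indexFor and reachable are names I made up; the n = 10 case is my own addition and does not occur in HashMap.

```java
import java.util.Set;
import java.util.TreeSet;

// Illustrative sketch; class and method names are hypothetical.
class IndexMaskSketch {
    // Same expression HashMap uses to pick a bucket.
    static int indexFor(int hash, int n) {
        return (n - 1) & hash;
    }

    // All indices the mask can produce for a given array length n.
    static Set<Integer> reachable(int n) {
        Set<Integer> seen = new TreeSet<>();
        for (int hash = -1024; hash <= 1024; hash++) {
            seen.add(indexFor(hash, n));
        }
        return seen;
    }

    public static void main(String[] args) {
        int n = 16;
        int[] hashes = {0, 5, 16, 31, 123456789, -1, Integer.MIN_VALUE};
        for (int h : hashes) {
            System.out.println(h + " -> index " + indexFor(h, n)
                    + ", floorMod " + Math.floorMod(h, n));
        }
        System.out.println("n = 16 reaches " + reachable(16));
        // Hypothetical non-power-of-two length: most slots are never used.
        System.out.println("n = 10 reaches " + reachable(10));
    }
}
```

For n = 16 the mask gives the same result as floorMod, even for negative hash values, and all 16 slots are reachable. For n = 10 the mask is 1001 in binary, so only indices 0, 1, 8 and 9 can ever be produced.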
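So, because the length of HashMap's array is a power of two, an element's position in the array can be computed very cheaply from the key's hash value, which is quite clever.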