JDK1.8逐字逐句帶你理解ConcurrentHashMap(2)
引言:
在上一篇博文我們介紹了ConcurrentHashMap在jdk1.8中所必要的知識,作為基礎入門。因為jdk1.8的ConcurrentHashMap做了太多的變動,所以新知識學習是必要的。今天是ConcurrentHashMap的第二篇,第二篇主要是認識ConcurrentHashMap,我將會介紹一下它的關鍵成員變數和一些關鍵的類。大家可以結合前幾篇博文的HashMap的知識,很多還是很相似的。筆者目前整理的一些blog針對面試都是超高頻出現的。大家可以點選連結:http://blog.csdn.net/u012403290
如何實現執行緒安全:
我們都知道ConcurrentHashMap核心是執行緒安全的,那麼它又是用什麼來實現執行緒安全的呢?在jdk1.8中主要是採用了CAS演算法實現執行緒安全的。在上一篇博文中已經介紹了CAS的無鎖操作,這裡不再贅述。同時它通過CAS演算法又實現了3種原子操作(執行緒安全的保障就是操作具有原子性),下面我賦值了原始碼分別表示哪些成員變數採用了CAS演算法,然後又是哪些方法實現了操作的原子性:
// Unsafe mechanics CAS保障了哪些成員變數操作是原子性的
private static final sun.misc.Unsafe U;
private static final long LOCKSTATE;
static {
try {
U = sun.misc.Unsafe.getUnsafe();
Class<?> k = TreeBin.class; //操作TreeBin,後面會介紹這個類
LOCKSTATE = U.objectFieldOffset
(k.getDeclaredField("lockState" ));
} catch (Exception e) {
throw new Error(e);
}
}
--------------------------------------------------------------------------------------
private static final sun.misc.Unsafe U;
private static final long SIZECTL;
private static final long TRANSFERINDEX;
private static final long BASECOUNT;
private static final long CELLSBUSY;
private static final long CELLVALUE;
private static final long ABASE;
private static final int ASHIFT;
static {
try {
//以下變數會在下面介紹到
U = sun.misc.Unsafe.getUnsafe();
Class<?> k = ConcurrentHashMap.class;
SIZECTL = U.objectFieldOffset
(k.getDeclaredField("sizeCtl"));
TRANSFERINDEX = U.objectFieldOffset
(k.getDeclaredField("transferIndex"));
BASECOUNT = U.objectFieldOffset
(k.getDeclaredField("baseCount"));
CELLSBUSY = U.objectFieldOffset
(k.getDeclaredField("cellsBusy"));
Class<?> ck = CounterCell.class;
CELLVALUE = U.objectFieldOffset
(ck.getDeclaredField("value"));
Class<?> ak = Node[].class;
ABASE = U.arrayBaseOffset(ak);
int scale = U.arrayIndexScale(ak);
if ((scale & (scale - 1)) != 0)
throw new Error("data type scale not a power of two");
ASHIFT = 31 - Integer.numberOfLeadingZeros(scale);
} catch (Exception e) {
throw new Error(e);
}
}
//3個原子性操作方法:
/* ---------------- Table element access -------------- */
/*
* Volatile access methods are used for table elements as well as
* elements of in-progress next table while resizing. All uses of
* the tab arguments must be null checked by callers. All callers
* also paranoically precheck that tab's length is not zero (or an
* equivalent check), thus ensuring that any index argument taking
* the form of a hash value anded with (length - 1) is a valid
* index. Note that, to be correct wrt arbitrary concurrency
* errors by users, these checks must operate on local variables,
* which accounts for some odd-looking inline assignments below.
* Note that calls to setTabAt always occur within locked regions,
* and so in principle require only release ordering, not
* full volatile semantics, but are currently coded as volatile
* writes to be conservative.
*/
@SuppressWarnings("unchecked")
static final <K,V> Node<K,V> tabAt(Node<K,V>[] tab, int i) {
return (Node<K,V>)U.getObjectVolatile(tab, ((long)i << ASHIFT) + ABASE);
}
static final <K,V> boolean casTabAt(Node<K,V>[] tab, int i,
Node<K,V> c, Node<K,V> v) {
return U.compareAndSwapObject(tab, ((long)i << ASHIFT) + ABASE, c, v);
}
static final <K,V> void setTabAt(Node<K,V>[] tab, int i, Node<K,V> v) {
U.putObjectVolatile(tab, ((long)i << ASHIFT) + ABASE, v);
}
以上這些基本實現了執行緒安全,還有一點是jdk1.8優化的結果,在以前的ConcurrentHashMap中是鎖定了Segment,而在jdk1.8被移除,現在鎖定的是一個Node頭節點(注意,synchronized鎖定的是頭結點,這一點從下面的原始碼中就可以看出來),減小了鎖的粒度,效能和衝突都會減少,以下是原始碼中的體現:
//這段程式碼其實是在擴容階段對頭節點的鎖定,其實還有很多地方不一一列舉。
synchronized (f) {
if (tabAt(tab, i) == f) {
Node<K,V> ln, hn;
if (fh >= 0) {
int runBit = fh & n;
Node<K,V> lastRun = f;
for (Node<K,V> p = f.next; p != null; p = p.next) {
int b = p.hash & n;
if (b != runBit) {
runBit = b;
lastRun = p;
}
}
if (runBit == 0) {
ln = lastRun;
hn = null;
}
else {
hn = lastRun;
ln = null;
}
for (Node<K,V> p = f; p != lastRun; p = p.next) {
int ph = p.hash; K pk = p.key; V pv = p.val;
if ((ph & n) == 0)
ln = new Node<K,V>(ph, pk, pv, ln);
else
hn = new Node<K,V>(ph, pk, pv, hn);
}
setTabAt(nextTab, i, ln);
setTabAt(nextTab, i + n, hn);
setTabAt(tab, i, fwd);
advance = true;
}
else if (f instanceof TreeBin) {
.....
}
}
如何儲存資料:
知道了ConcurrentHashMap是如何實現執行緒安全的同時,最起碼我們還要知道ConcurrentHashMap又是怎麼實現資料儲存的。以下是儲存的圖:
有人看了之後會想,這個不是HashMap的儲存結構麼?在jdk1.8中取消了segment,所以結構其實和HashMap是極其相似的,在HashMap的基礎上實現了執行緒安全,同時在每一個“桶”中的節點會被鎖定。
重要的成員變數:
1、capacity:容量,表示目前map的儲存大小,在原始碼中分為預設和最大,預設是在沒有指定容量大小的時候會賦予這個值,最大表示當容量達到這個值時,不再支援擴容。
/**
* The largest possible table capacity. This value must be
* exactly 1<<30 to stay within Java array allocation and indexing
* bounds for power of two table sizes, and is further required
* because the top two bits of 32bit hash fields are used for
* control purposes.
*/
private static final int MAXIMUM_CAPACITY = 1 << 30;
/**
* The default initial table capacity. Must be a power of 2
* (i.e., at least 1) and at most MAXIMUM_CAPACITY.
*/
private static final int DEFAULT_CAPACITY = 16;
2、laodfactor:載入因子,這個和HashMap是一樣的,預設值也是0.75f。有不清楚的可以去尋找上篇介紹HashMap的博文。
/**
* The load factor for this table. Overrides of this value in
* constructors affect only the initial table capacity. The
* actual floating point value isn't normally used -- it is
* simpler to use expressions such as {@code n - (n >>> 2)} for
* the associated resizing threshold.
*/
private static final float LOAD_FACTOR = 0.75f;
3、TREEIFY_THRESHOLD與UNTREEIFY_THRESHOLD:作為了解,這個兩個主要是控制連結串列和紅黑樹轉化的,前者表示大於這個值,需要把連結串列轉換為紅黑樹,後者表示如果紅黑樹的節點小於這個值需要重新轉化為連結串列。關於為什麼要把連結串列轉化為紅黑樹,在HashMap的介紹中,我已經詳細解釋過了。
/**
* The bin count threshold for using a tree rather than list for a
* bin. Bins are converted to trees when adding an element to a
* bin with at least this many nodes. The value must be greater
* than 2, and should be at least 8 to mesh with assumptions in
* tree removal about conversion back to plain bins upon
* shrinkage.
*/
static final int TREEIFY_THRESHOLD = 8;
/**
* The bin count threshold for untreeifying a (split) bin during a
* resize operation. Should be less than TREEIFY_THRESHOLD, and at
* most 6 to mesh with shrinkage detection under removal.
*/
static final int UNTREEIFY_THRESHOLD = 6;
4、下面3個引數作為了解,主要是在擴容和參與擴容(當執行緒進入put的時候,發現該map正在擴容,那麼它會協助擴容)的時候使用,在下一篇博文中會簡單介紹到。
/**
* The number of bits used for generation stamp in sizeCtl.
* Must be at least 6 for 32bit arrays.
*/
private static int RESIZE_STAMP_BITS = 16;
/**
* The maximum number of threads that can help resize.
* Must fit in 32 - RESIZE_STAMP_BITS bits.
*/
private static final int MAX_RESIZERS = (1 << (32 - RESIZE_STAMP_BITS)) - 1;
/**
* The bit shift for recording size stamp in sizeCtl.
*/
private static final int RESIZE_STAMP_SHIFT = 32 - RESIZE_STAMP_BITS;
5、下面2個欄位比較重要,是執行緒判斷map當前處於什麼階段。MOVED表示該節點是個forwarding Node,表示有執行緒處理過了。後者表示判斷到這個節點是一個樹節點。
static final int MOVED = -1; // hash for forwarding nodes
static final int TREEBIN = -2; // hash for roots of trees
6、sizeCtl,標誌控制符。這個引數非常重要,出現在ConcurrentHashMap的各個階段,不同的值也表示不同情況和不同功能:
①負數代表正在進行初始化或擴容操作
②-N 表示有N-1個執行緒正在進行擴容操作 (前面已經說過了,當執行緒進行值新增的時候判斷到正在擴容,它就會協助擴容)
③正數或0代表hash表還沒有被初始化,這個數值表示初始化或下一次進行擴容的大小,類似於擴容閾值。它的值始終是當前ConcurrentHashMap容量的0.75倍,這與loadfactor是對應的。實際容量>=sizeCtl,則擴容。
注意:在某些情況下,這個值就相當於HashMap中的threshold閥值。用於控制擴容。
極其重要的幾個內部類:
如果要理解ConcurrentHashMap的底層,必須要了解它相關聯的一些內部類。
1、Node
/**
* Key-value entry. This class is never exported out as a
* user-mutable Map.Entry (i.e., one supporting setValue; see
* MapEntry below), but can be used for read-only traversals used
* in bulk tasks. Subclasses of Node with a negative hash field
* are special, and contain null keys and values (but are never
* exported). Otherwise, keys and vals are never null.
*/
static class Node<K,V> implements Map.Entry<K,V> {
final int hash;
final K key;
volatile V val; //用volatile修飾
volatile Node<K,V> next;//用volatile修飾
Node(int hash, K key, V val, Node<K,V> next) {
this.hash = hash;
this.key = key;
this.val = val;
this.next = next;
}
public final K getKey() { return key; }
public final V getValue() { return val; }
public final int hashCode() { return key.hashCode() ^ val.hashCode(); }
public final String toString(){ return key + "=" + val; }
public final V setValue(V value) {
throw new UnsupportedOperationException(); //不可以直接setValue
}
public final boolean equals(Object o) {
Object k, v, u; Map.Entry<?,?> e;
return ((o instanceof Map.Entry) &&
(k = (e = (Map.Entry<?,?>)o).getKey()) != null &&
(v = e.getValue()) != null &&
(k == key || k.equals(key)) &&
(v == (u = val) || v.equals(u)));
}
/**
* Virtualized support for map.get(); overridden in subclasses.
*/
Node<K,V> find(int h, Object k) {
Node<K,V> e = this;
if (k != null) {
do {
K ek;
if (e.hash == h &&
((ek = e.key) == k || (ek != null && k.equals(ek))))
return e;
} while ((e = e.next) != null);
}
return null;
}
}
從上面的Node內部類原始碼可以看出,它的value 和 next是用volatile修飾的,關於volatile已經在前面一篇博文介紹過,使得value和next具有可見性和有序性,從而保證執行緒安全。同時大家仔細看過程式碼就會發現setValue()方法訪問是會丟擲異常,是禁止用該方法直接設定value值的。同時它還錯了一個find的方法,該方法主要是使用者尋找某一個節點。
2、TreeNode和TreeBin
/**
* Nodes for use in TreeBins
*/
static final class TreeNode<K,V> extends Node<K,V> {
TreeNode<K,V> parent; // red-black tree links
TreeNode<K,V> left;
TreeNode<K,V> right;
TreeNode<K,V> prev; // needed to unlink next upon deletion
boolean red;
TreeNode(int hash, K key, V val, Node<K,V> next,
TreeNode<K,V> parent) {
super(hash, key, val, next);
this.parent = parent;
}
Node<K,V> find(int h, Object k) {
return findTreeNode(h, k, null);
}
/**
* Returns the TreeNode (or null if not found) for the given key
* starting at given root.
*/
final TreeNode<K,V> findTreeNode(int h, Object k, Class<?> kc) {
if (k != null) {
TreeNode<K,V> p = this;
do {
int ph, dir; K pk; TreeNode<K,V> q;
TreeNode<K,V> pl = p.left, pr = p.right;
if ((ph = p.hash) > h)
p = pl;
else if (ph < h)
p = pr;
else if ((pk = p.key) == k || (pk != null && k.equals(pk)))
return p;
else if (pl == null)
p = pr;
else if (pr == null)
p = pl;
else if ((kc != null ||
(kc = comparableClassFor(k)) != null) &&
(dir = compareComparables(kc, k, pk)) != 0)
p = (dir < 0) ? pl : pr;
else if ((q = pr.findTreeNode(h, k, kc)) != null)
return q;
else
p = pl;
} while (p != null);
}
return null;
}
}
//TreeBin太長,筆者截取了它的構造方法:
TreeBin(TreeNode<K,V> b) {
super(TREEBIN, null, null, null);
this.first = b;
TreeNode<K,V> r = null;
for (TreeNode<K,V> x = b, next; x != null; x = next) {
next = (TreeNode<K,V>)x.next;
x.left = x.right = null;
if (r == null) {
x.parent = null;
x.red = false;
r = x;
}
else {
K k = x.key;
int h = x.hash;
Class<?> kc = null;
for (TreeNode<K,V> p = r;;) {
int dir, ph;
K pk = p.key;
if ((ph = p.hash) > h)
dir = -1;
else if (ph < h)
dir = 1;
else if ((kc == null &&
(kc = comparableClassFor(k)) == null) ||
(dir = compareComparables(kc, k, pk)) == 0)
dir = tieBreakOrder(k, pk);
TreeNode<K,V> xp = p;
if ((p = (dir <= 0) ? p.left : p.right) == null) {
x.parent = xp;
if (dir <= 0)
xp.left = x;
else
xp.right = x;
r = balanceInsertion(r, x);
break;
}
}
}
}
this.root = r;
assert checkInvariants(root);
}
從上面的原始碼可以看出,在ConcurrentHashMap中不是直接儲存TreeNode來實現的,而是用TreeBin來包裝TreeNode來實現的。也就是說在實際的ConcurrentHashMap桶中,存放的是TreeBin物件,而不是TreeNode物件。之所以TreeNode繼承自Node是為了附帶next指標,而這個next指標可以在TreeBin中尋找下一個TreeNode,這裡也是與HashMap之間比較大的區別。
3、ForwordingNode
/**
* A node inserted at head of bins during transfer operations.
*/
static final class ForwardingNode<K,V> extends Node<K,V> {
final Node<K,V>[] nextTable;
ForwardingNode(Node<K,V>[] tab) {
super(MOVED, null, null, null);
this.nextTable = tab;
}
Node<K,V> find(int h, Object k) {
// loop to avoid arbitrarily deep recursion on forwarding nodes
outer: for (Node<K,V>[] tab = nextTable;;) {
Node<K,V> e; int n;
if (k == null || tab == null || (n = tab.length) == 0 ||
(e = tabAt(tab, (n - 1) & h)) == null)
return null;
for (;;) {
int eh; K ek;
if ((eh = e.hash) == h &&
((ek = e.key) == k || (ek != null && k.equals(ek))))
return e;
if (eh < 0) {
if (e instanceof ForwardingNode) {
tab = ((ForwardingNode<K,V>)e).nextTable;
continue outer;
}
else
return e.find(h, k);
}
if ((e = e.next) == null)
return null;
}
}
}
}
這個靜態內部內就顯得獨具匠心,它的使用主要是在擴容階段,它是連結兩個table的節點類,有一個next屬性用於指向下一個table,注意要理解這個table,它並不是說有2個table,而是在擴容的時候當執行緒讀取到這個地方發現這個地方為空,這會設定為forwordingNode,或者執行緒處理完該節點也會設定該節點為forwordingNode,別的執行緒發現這個forwordingNode會繼續向後執行遍歷,這樣一來就很好的解決了多執行緒安全的問題。這裡有小夥伴就會問,那一個執行緒開始處理這個節點還沒處理完,別的執行緒進來怎麼辦,而且這個節點還不是forwordingNode吶?說明你前面沒看詳細,在處理某個節點(桶裡面第一個節點)的時候會對該節點上鎖,上面文章中我已經說過了。
認識階段就寫到這裡,對這些東西有一定的瞭解,在下一篇,也就是尾篇中,我會逐字逐句來介紹transfer()擴容,put()新增和get()查詢三個方法。
如果博文存在什麼問題,或者有什麼想法,可以聯絡我呀,下面是我的微信二維碼: