.NET Source Code Analysis – The Generic Dictionary

阿新 • • Published: 2018-12-01

Dictionary<TKey, TValue> source: https://github.com/dotnet/corefx/blob/master/src/System.Collections/src/System/Collections/Generic/Dictionary.cs

Interfaces

Dictionary<TKey, TValue> implements much the same set of interfaces as List<T>, so they are not repeated here; see the List<T> article for details.

Fields

Here are its member variables:

private int[] buckets;
private Entry[] entries;
private int count;
private int version;

private int freeList;
private int freeCount;
private IEqualityComparer<TKey> comparer;
private KeyCollection keys;
private ValueCollection values;
private Object _syncRoot;

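buckets is an int array. Its exact purpose is not clear yet and will be covered later; for now, think of each bucket as a partition, much like the partitions of a hard disk that group data to make it easier to find.

entries is an array of Entry. Here is Entry: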

private struct Entry
{
    public int hashCode;    // Lower 31 bits of hash code, -1 if unused
    public int next;        // Index of next entry, -1 if last
    public TKey key;           // Key of entry
    public TValue value;         // Value of entry
}

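It is a struct holding key and value, which tells us that the Dictionary's keys and values are stored in this struct. It also has hashCode and next, which makes it look like a linked list; their purpose is analyzed when they come into play.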

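count: as in List<T>, this is the number of elements rather than the capacity (strictly speaking it is not the true element count either, as explained below).

version: covered in the List<T> article; it is used to detect modification of the collection during enumeration.

freeList, freeCount: these two look odd and their purpose is hard to guess. They are used when adding and removing items and are covered later.

comparer: the key comparer, used to get a key's hash code and to test whether two keys are equal.

keys, values: familiar from everyday use; they are for enumerating the keys or the values.

_syncRoot: also covered in the List<T> article; it relates to thread safety. As with List<T>, Dictionary does not use this object itself, and Dictionary is not thread-safe, so callers must do their own locking in a multithreaded environment.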
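
As a minimal sketch of "do your own locking", the snippet below guards every write to a shared Dictionary with a private lock object. The names LockedDictionaryDemo, FillConcurrently and gate are made up for illustration and are not part of the Dictionary source; this is one possible approach under those assumptions, not the only one.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;

public static class LockedDictionaryDemo
{
    // Hypothetical helper: writes the keys 0..n-1 from multiple threads,
    // serializing every access through a private lock object.
    public static int FillConcurrently(int n)
    {
        var dict = new Dictionary<int, int>();
        var gate = new object();

        Parallel.For(0, n, i =>
        {
            lock (gate) // Dictionary is not thread-safe, so guard each write
            {
                dict[i] = i * 2;
            }
        });

        lock (gate) // reads that may race with writers need the same lock
        {
            return dict.Count;
        }
    }
}
```

Because every write goes through the same lock, the returned count equals n. Without the lock, concurrent writes can corrupt the internal buckets and entries arrays.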

Example

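Dictionary's code is somewhat more complex than List's. Rather than analyzing the source directly, the following common operations are used to show step by step how Dictionary works: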

Dictionary<string, string> dict = new Dictionary<string, string>();

dict.Add("a", "A");

dict.Add("b", "B");

dict.Add("c", "C");

dict["d"] = "D";

dict["a"] = "AA";

dict.Remove("b");

dict.Add("e", "E");

var a = dict["a"];

var hasA = dict.ContainsKey("a");
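
Real string hash codes are not small, predictable numbers, so to replay this example with known bucket indices one option is a comparer that pins the hash codes. The sketch below is only an illustration: FixedHashComparer and WalkthroughDemo are hypothetical names, and the values 3, 4, 6, 11 and 10 are the hash codes this article assumes for "a" through "e".

```csharp
using System;
using System.Collections.Generic;

// Hypothetical comparer that pins the hash codes assumed in this article.
public sealed class FixedHashComparer : IEqualityComparer<string>
{
    private static readonly Dictionary<string, int> Assumed = new Dictionary<string, int>
    {
        ["a"] = 3, ["b"] = 4, ["c"] = 6, ["d"] = 11, ["e"] = 10
    };

    public bool Equals(string x, string y) => string.Equals(x, y, StringComparison.Ordinal);

    public int GetHashCode(string key)
    {
        if (key == null) return 0;
        // Keys outside the assumed table fall back to their real hash code.
        return Assumed.TryGetValue(key, out int h) ? h : key.GetHashCode();
    }
}

public static class WalkthroughDemo
{
    // Runs the same sequence of operations as the example above.
    public static Dictionary<string, string> Run()
    {
        var dict = new Dictionary<string, string>(new FixedHashComparer());
        dict.Add("a", "A");
        dict.Add("b", "B");
        dict.Add("c", "C");
        dict["d"] = "D";
        dict["a"] = "AA";   // overwrites the existing value
        dict.Remove("b");
        dict.Add("e", "E");
        return dict;
    }
}
```

After Run(), the dictionary holds the keys "a", "c", "d" and "e", and "a" maps to "AA".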

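To make the analysis easier, assume the following hash codes:

The hash code of "a" is 3

The hash code of "b" is 4

The hash code of "c" is 6

The hash code of "d" is 11

The hash code of "e" is 10

Constructors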

Start with the first statement, which creates a new Dictionary<string, string>. The source has six constructors:

public Dictionary() : this(0, null) { }

public Dictionary(int capacity) : this(capacity, null) { }

public Dictionary(IEqualityComparer<TKey> comparer) : this(0, comparer) { }

public Dictionary(int capacity, IEqualityComparer<TKey> comparer)
{
    if (capacity < 0) throw new ArgumentOutOfRangeException(nameof(capacity), capacity, "");
    if (capacity > 0) Initialize(capacity);
    this.comparer = comparer ?? EqualityComparer<TKey>.Default;
}

public Dictionary(IDictionary<TKey, TValue> dictionary) : this(dictionary, null) { }

public Dictionary(IDictionary<TKey, TValue> dictionary, IEqualityComparer<TKey> comparer) :
    this(dictionary != null ? dictionary.Count : 0, comparer)
{
    if (dictionary == null)
    {
        throw new ArgumentNullException(nameof(dictionary));
    }
    if (dictionary.GetType() == typeof(Dictionary<TKey, TValue>))
    {
        Dictionary<TKey, TValue> d = (Dictionary<TKey, TValue>)dictionary;
        int count = d.count;
        Entry[] entries = d.entries;
        for (int i = 0; i < count; i++)
        {
            if (entries[i].hashCode >= 0)
            {
                Add(entries[i].key, entries[i].value);
            }
        }
        return;
    }

    foreach (KeyValuePair<TKey, TValue> pair in dictionary)
    {
        Add(pair.Key, pair.Value);
    }
}

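Most of them just supply default values. The one that does the real work is public Dictionary(int capacity, IEqualityComparer<TKey> comparer), which every other constructor calls. Here is what it does:

if (capacity > 0) Initialize(capacity); The initialization function is called only when capacity is greater than 0, that is, when a capacity was specified explicitly. capacity means the same thing as in the List<T> article. The difference is that, in this version of the source, Dictionary accepts a capacity only in its constructor, whereas List<T> lets you set it at any time. Next, the initialization function: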

private void Initialize(int capacity)
{
    int size = HashHelpers.GetPrime(capacity);
    buckets = new int[size];
    for (int i = 0; i < buckets.Length; i++) buckets[i] = -1;
    entries = new Entry[size];
    freeList = -1;
}

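HashHelpers.GetPrime(capacity) returns a prime number based on the capacity passed in. A prime (2, 3, 5, 7, 11, 13, and so on) is an integer greater than 1 that is divisible only by 1 and itself. Here is the function: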

public static readonly int[] primes = {
    3, 7, 11, 17, 23, 29, 37, 47, 59, 71, 89, 107, 131, 163, 197, 239, 293, 353, 431, 521, 631, 761, 919,
    1103, 1327, 1597, 1931, 2333, 2801, 3371, 4049, 4861, 5839, 7013, 8419, 10103, 12143, 14591,
    17519, 21023, 25229, 30293, 36353, 43627, 52361, 62851, 75431, 90523, 108631, 130363, 156437,
    187751, 225307, 270371, 324449, 389357, 467237, 560689, 672827, 807403, 968897, 1162687, 1395263,
    1674319, 2009191, 2411033, 2893249, 3471899, 4166287, 4999559, 5999471, 7199369, 8639249, 10367101,
    12440537, 14928671, 17914409, 21497293, 25796759, 30956117, 37147349, 44576837, 53492207, 64190669,
    77028803, 92434613, 110921543, 133105859, 159727031, 191672443, 230006941, 276008387, 331210079,
    397452101, 476942527, 572331049, 686797261, 824156741, 988988137, 1186785773, 1424142949, 1708971541,
    2050765853, MaxPrimeArrayLength };
        
public static int GetPrime(int min)
{
    if (min < 0)
        throw new ArgumentException("");
    Contract.EndContractBlock();

    for (int i = 0; i < primes.Length; i++)
    {
        int prime = primes[i];
        if (prime >= min) return prime;
    }

    return min;
}

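A table of primes is maintained here. Note that it is not the complete sequence of primes: some are skipped because they are too close together. With 2 and 3, for example, having to grow after adding a single element would be pointless.

The line if (prime >= min) return prime; shows that GetPrime returns the first prime in the table that is greater than or equal to the value passed in. If 1 is passed, for example, 3 becomes the initial capacity.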
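
The lookup can be tried in isolation with a small sketch. HashHelpers is internal to the framework, so the class below is a stand-in named PrimeTableSketch (a made-up name), and it carries only the first few entries of the table quoted above.

```csharp
using System;

public static class PrimeTableSketch
{
    // Only the start of the table quoted above; the real table is much longer.
    private static readonly int[] Primes = { 3, 7, 11, 17, 23, 29, 37, 47, 59, 71, 89 };

    public static int GetPrime(int min)
    {
        if (min < 0) throw new ArgumentException("min must be non-negative", nameof(min));

        foreach (int prime in Primes)
        {
            // First table entry that is greater than or equal to min.
            if (prime >= min) return prime;
        }

        return min; // same fallback as the listing above
    }
}
```

With this table, GetPrime(0) and GetPrime(3) both return 3, GetPrime(4) returns 7, and GetPrime(8) returns 11.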

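Back to the initialization code: size works out to 3. buckets and entries are both allocated with this size, every element of buckets is set to -1, and freeList is likewise initialized to -1; this matters later.

After initialization, this line runs: this.comparer = comparer ?? EqualityComparer<TKey>.Default; which sets up the comparer. Here is what EqualityComparer<TKey>.Default actually resolves to: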

public static EqualityComparer<T> Default
{
    get
    {
        if (_default == null)
        {
            object comparer;
                    
            if (typeof(T) == typeof(SByte))
                comparer = new EqualityComparerForSByte();
            else if (typeof(T) == typeof(Byte))
                comparer = new EqualityComparerForByte();
            else if (typeof(T) == typeof(Int16))
                comparer = new EqualityComparerForInt16();
            else if (typeof(T) == typeof(UInt16))
                comparer = new EqualityComparerForUInt16();
            else if (typeof(T) == typeof(Int32))
                comparer = new EqualityComparerForInt32();
            else if (typeof(T) == typeof(UInt32))
                comparer = new EqualityComparerForUInt32();
            else if (typeof(T) == typeof(Int64))
                comparer = new EqualityComparerForInt64();
            else if (typeof(T) == typeof(UInt64))
                comparer = new EqualityComparerForUInt64();
            else if (typeof(T) == typeof(IntPtr))
                comparer = new EqualityComparerForIntPtr();
            else if (typeof(T) == typeof(UIntPtr))
                comparer = new EqualityComparerForUIntPtr();
            else if (typeof(T) == typeof(Single))
                comparer = new EqualityComparerForSingle();
            else if (typeof(T) == typeof(Double))
                comparer = new EqualityComparerForDouble();
            else if (typeof(T) == typeof(Decimal))
                comparer = new EqualityComparerForDecimal();
            else if (typeof(T) == typeof(String))
                comparer = new EqualityComparerForString();
            else
                comparer = new LastResortEqualityComparer<T>();

            _default = (EqualityComparer<T>)comparer;
        }

        return _default;
    }
}

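A comparer is created for each type. Below is the string comparer used in our example: the hash code is simply the string's own hash code. In fact every comparer here obtains the hash code the same way; only Equals differs in a few cases.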

internal sealed class EqualityComparerForString : EqualityComparer<String>
{
    public override bool Equals(String x, String y)
    {
        return x == y;
    }

    public override int GetHashCode(String x)
    {
        if (x == null)
            return 0;
        return x.GetHashCode();
    }
}
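
Since both the hash code and the equality test come from the comparer, passing a different comparer changes which keys count as the same. A short sketch using the framework's StringComparer.OrdinalIgnoreCase follows; ComparerDemo and its method names are made up for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class ComparerDemo
{
    public static bool LookupIgnoringCase()
    {
        // StringComparer.OrdinalIgnoreCase implements IEqualityComparer<string>.
        var dict = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
        dict.Add("a", "A");
        // Under this comparer "A" and "a" hash and compare as the same key.
        return dict.ContainsKey("A");
    }

    public static bool LookupDefault()
    {
        // No comparer given, so EqualityComparer<string>.Default is used.
        var dict = new Dictionary<string, string>();
        dict.Add("a", "A");
        return dict.ContainsKey("A");
    }
}
```

LookupIgnoringCase() returns true, while LookupDefault() returns false because the default string comparison is case-sensitive.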

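That covers the basic constructors. There is also a constructor that takes an IDictionary<TKey, TValue>; as with List<T>, it adds that collection's items during construction. It first checks whether the argument is a Dictionary. If so, it walks that dictionary's entries array directly and adds each item to the current one; otherwise it iterates with the enumerator.

Why not always use the enumerator? Because an enumerator has some overhead of its own and is slower than walking the array directly.

This constructor adds items with the Add method, the same Add used in the example, which is exactly what comes next.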

`Add("a", "A")`

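The figure below shows the initial state of the variables:
[Figure: initial state of the variables]

Add simply calls Insert, passing true as the third argument.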

public void Add(TKey key, TValue value)
{
    Insert(key, value, true);
}

Now the Insert method. It is the core method and a bit long, so follow the comments step by step.

private void Insert(TKey key, TValue value, bool add)
{
    if (key == null)
    {
        throw new ArgumentNullException(nameof(key));
    }
    // If buckets is null, initialize first. The first call gets here and initializes with a
    // capacity of 0; per the analysis above the resulting initial capacity is 3, so 3 is the
    // default capacity of Dictionary<TKey, TValue>.
    if (buckets == null) Initialize(0);

    // The hash code is ANDed with 0x7FFFFFFF, which is int.MaxValue in hex; in binary it is
    // 01111111111111111111111111111111, where the leading bit is the sign bit. If
    // comparer.GetHashCode(key) is non-negative, the & leaves it unchanged. If it is negative
    // (stored in two's complement), the result is 0x7FFFFFFF - (-hashcode) + 1, which is
    // non-negative. In short, this is a cheap way to get a non-negative number.
    int hashCode = comparer.GetHashCode(key) & 0x7FFFFFFF;

    // Take the new hash code modulo the size of buckets to get the target bucket index.
    int targetBucket = hashCode % buckets.Length;

    // Walk the chain starting at buckets[targetBucket]. The hash code of "a" is 3, so
    // targetBucket is 0 and buckets[0] is -1; the loop requires i >= 0, so it exits at once.
    for (int i = buckets[targetBucket]; i >= 0; i = entries[i].next)
    {
        if (entries[i].hashCode == hashCode && comparer.Equals(entries[i].key, key))
        {
            if (add)
            {
                throw new ArgumentException(SR.Format(SR.Argument_AddingDuplicate, key));
            }
            entries[i].value = value;
            version++;
            return;
        }
    }

    int index;
    // freeCount is 0 at this point (freeList is the one initialized to -1), so the else branch runs.
    if (freeCount > 0)
    {
        index = freeList;
        freeList = entries[index].next;
        freeCount--;
    }
    else
    {
        // count, the number of elements, is 0 and entries has length 3 after initialization,
        // so no resize is needed.
        if (count == entries.Length)
        {
            Resize();
            targetBucket = hashCode % buckets.Length;
        }
        // index = count means index is the slot in entries to write next; currently 0.
        index = count;

        // One more element.
        count++;
    }

    // Store the key's hash code in entries[0].hashCode so it need not be recomputed later.
    entries[index].hashCode = hashCode;
    // entries[0].next takes the current value of buckets[0], which is -1.
    entries[index].next = buckets[targetBucket];
    // Set the key and value.
    entries[index].key = key;
    entries[index].value = value;
    // Then set buckets[0] = 0.
    buckets[targetBucket] = index;
    // Nothing more to say here; see the List<T> article if this is unfamiliar.
    version++;
}
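
The two arithmetic steps at the top of Insert can be checked on their own. The helper below is an illustrative sketch; BucketMath, NonNegative and TargetBucket are made-up names that do not appear in the Dictionary source.

```csharp
public static class BucketMath
{
    // Clears the sign bit so the result always lies in [0, int.MaxValue].
    public static int NonNegative(int rawHash) => rawHash & 0x7FFFFFFF;

    // Maps a raw hash code to a bucket index in [0, bucketCount - 1].
    public static int TargetBucket(int rawHash, int bucketCount) =>
        NonNegative(rawHash) % bucketCount;
}
```

NonNegative(3) stays 3, NonNegative(-1) is int.MaxValue, and NonNegative(int.MinValue) is 0. One reason a mask suits this job better than Math.Abs is that Math.Abs(int.MinValue) throws an OverflowException. With the hash codes assumed in this article and 3 buckets, TargetBucket gives 0 for "a" (3), 1 for "b" (4) and 0 for "c" (6).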

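At this point we can guess the purpose of the buckets. A dictionary exists to get a value quickly from its key. Taking the key's hash code modulo the length gives a number between 0 and length-1. In the best case the keys are fully spread out and each key maps to its own bucket, so every item in entries has its own bucket, and a lookup follows the path shown below:
[Figure: looking up a value through its bucket]
This lookup is very fast because nothing is traversed. In practice, though, the remainders will not all be distinct; some are bound to repeat. How are those duplicates handled? Keep following the Insert code:

The variables are now in the state shown below:
[Figure: state of the variables after Add("a", "A")]
The figure shows that the hash code yields the bucket index (purple line), the bucket's value points to the entry's index (yellow line), and the entry's next points to the bucket's previous value (red line). It does feel like a linked list.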
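
The bucket, entry and next relationship can be reproduced with a deliberately simplified sketch. MiniDictionary is a made-up class for illustration, not the real implementation: it has a fixed size, no resize, no removal, no free list and no duplicate-key check, and the hash function is passed in so that hash codes can be pinned.

```csharp
using System;

// Simplified sketch of the buckets/entries layout described above.
public sealed class MiniDictionary
{
    private struct Entry
    {
        public int HashCode;
        public int Next;
        public string Key;
        public string Value;
    }

    private readonly int[] _buckets;
    private readonly Entry[] _entries;
    private readonly Func<string, int> _hash;
    private int _count;

    public MiniDictionary(int size, Func<string, int> hash)
    {
        _buckets = new int[size];
        for (int i = 0; i < size; i++) _buckets[i] = -1; // -1 marks an empty bucket
        _entries = new Entry[size];
        _hash = hash;
    }

    public void Add(string key, string value)
    {
        if (_count == _entries.Length) throw new InvalidOperationException("sketch does not resize");
        int hashCode = _hash(key) & 0x7FFFFFFF;
        int bucket = hashCode % _buckets.Length;
        int index = _count++;
        _entries[index].HashCode = hashCode;
        _entries[index].Next = _buckets[bucket]; // link to the previous head of this bucket
        _entries[index].Key = key;
        _entries[index].Value = value;
        _buckets[bucket] = index;                // the new entry becomes the head
    }

    public bool TryGetValue(string key, out string value)
    {
        int hashCode = _hash(key) & 0x7FFFFFFF;
        // Follow the chain for this bucket until the key is found or -1 is reached.
        for (int i = _buckets[hashCode % _buckets.Length]; i >= 0; i = _entries[i].Next)
        {
            if (_entries[i].HashCode == hashCode && _entries[i].Key == key)
            {
                value = _entries[i].Value;
                return true;
            }
        }
        value = null;
        return false;
    }

    // Exposed only so the internal state can be inspected.
    public int BucketHead(int bucket) => _buckets[bucket];
    public int NextOf(int entryIndex) => _entries[entryIndex].Next;
}
```

With a size of 3 and a hash function that returns 3 for "a", adding "a" leaves BucketHead(0) equal to 0 and NextOf(0) equal to -1, matching the state described above. Any later key that lands in bucket 0 becomes the new head, and its Next points back at entry 0.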

`Add("b", "B")`

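Since the hash code of "b" is 4, the remainder is 1, which does not clash with any existing entry, so the flow is the same as above (ignore the lines on the left of the figure; they belong to the previous step).

`Add("c", "C")`

The hash code of "c" is 6, so the remainder is 0 and it also lands in bucket 0, which produces a collision: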

for (int i = buckets[targetBucket]; i >= 0; i = entries[i].next)
{
    if (entries[i].hashCode == hashCode && comparer.Equals(entries[i].key, key))
    {
        if (add)
        {
            throw new ArgumentException(SR.Format(SR.Argument_AddingDuplicate, key));
        }
        entries[i].value = value;
        version++;

 
 
              
           
              
              
            