Lighttpd1.4.20原始碼分析筆記通用陣列array.c(h)

阿新 • • 發佈：2019-02-08

首先回顧以下，在array.h中，UNSET型別是一個巨集：

#define DATA_UNSET \
    data_type_t type; \
    buffer *key; \
    int is_index_key; /* 1 if key is a array index (autogenerated keys) */ \
    struct data_unset *(*copy)(const struct data_unset *src); \
    void (* free)(struct data_unset *p); \
    void (* reset)(struct 
 data_unset *p); \
    int (*insert_dup)(struct data_unset *dst, struct data_unset *src); \
    void (*print)(const struct data_unset *p, int depth)

typedef struct data_unset {
    DATA_UNSET;
} data_unset;

使用巨集DATA_UNSET，這樣可以方便其他型別在定義中直接引用DATA_UNSET巨集來模擬繼承。在巨集DATA_UNSET中，定義了下面五個函式指標：

struct data_unset *(*copy)(const 
 struct data_unset *src); 
void (* free)(struct data_unset *p); 
void (* reset)(struct data_unset *p); 
int (*insert_dup)(struct data_unset *dst, struct data_unset *src); 
void (*print)(const struct data_unset *p, int depth)

這些函式指標相當於UNSET的成員函式，其他型別可以通過對這五個指標賦值來實現成員函式的重寫（Overwrite）。每種型別都配有自己特有的初始化函式，在這些初始化函式中，對上面這五個函式指標進行賦值。

作者很巧妙地用面向物件的思想來組織C程式碼。

我們可以例項性地看下STRING型別的初始化函式（data_string.c）：

data_string *data_string_init(void) {
    data_string *ds;

    ds = calloc(1, sizeof(*ds)); //分配的空間會自動清零
    assert(ds);
    /* 初始化資料成員，buffer_init用來分配記憶體空間 */
    ds->key = buffer_init();
    ds->value = buffer_init();
    /* 成員函式，對函式指標賦值 */
    ds->copy = data_string_copy;
    ds->free = data_string_free;
    ds->reset = data_string_reset;
    ds->insert_dup = data_string_insert_dup;
    ds->print = data_string_print;
    ds->type = TYPE_STRING;

    return ds;
}

array.h中，各個資料型別的標誌的定義：

typedef enum { 
        TYPE_UNSET,         /* 資料的型別未設定，
                               這幾種資料型別使用了面向物件的設計思想，
                               這個型別相當於父型別 
                             */
        TYPE_STRING,         /* 字串型別 */
        TYPE_COUNT,         /* COUNT型別 */
        TYPE_ARRAY,         /* 陣列型別 */
        TYPE_INTEGER,     /* 整數型別 */
        TYPE_FASTCGI,     /* FASTCGI型別 */
        TYPE_CONFIG         /* CONFIG型別 */
} data_type_t;

除了UNSET型別，其他型別的操作函式的實現都在檔案data_XXX.c中。這七個型別構成了通用陣列所要處理的型別。

為何叫做通用陣列呢？

因為在陣列的定義和實現中只使用UNSET型別，基於上面的定義，通用陣列可以不用關心陣列中儲存的到底是哪種具體的型別，只需將其按照UNSET型別來處理就可以了，所以說是通用的。

下面這個定義是通用陣列的核心定義：

typedef struct 
{
    /* UNSET型別的指標型陣列，存放陣列中的元素 */
    data_unset **data;
    /* 按 data 資料的排序順序儲存 data 的索引 */
    size_t *sorted;
    size_t used;    /* data中已經使用了的長度，也就是陣列中元素個數 */

        /* data的大小。data的大小會根據資料的多少變化，會為以後的資料預先分配空間 */
    size_t size;    
    /* 用於儲存唯一索引,初始為 0,之後遞增 */
    size_t unique_ndx;
    /* 比used大的最小的2的倍數。也就是離used最近的且比used大的2的倍數 ，用於在陣列中利用二分法查詢元素*/
    size_t next_power_of_2;
    /* data is weakref, don't bother the data */
    /* data就是一個指標，不用關係其所指向的內容 */
    int is_weakref;                
} array;

sorted（圖中僅展示data_unset中的key）：

這裡寫圖片描述

還有一個定義：

typedef struct {
    DATA_UNSET;

    array *value;
} data_array;

它定義了一個array型別的資料，也就是說，通用陣列中存放的資料可以是通用陣列，這樣可以形成多維的通用陣列。

在array.h中定義瞭如下的通用陣列操作函式：

1、array *array_init(void);
初始化陣列，分配空間。

2、array array_init_array(array a);
用陣列a來初始化一個數組。也就是得到一個a的深拷貝。

3、void array_free(array * a);
釋放陣列。釋放所有空間。

4、void array_reset(array * a);
重置data中的所有資料（呼叫UNSET型別資料中的reset函式），並將used設為0。相當於清空陣列。

5、int array_insert_unique(array * a, data_unset * str);
將str插入到陣列中，如果陣列中存在key與str相同的資料，則把str的內容拷貝到這個資料中。

6、data_unset array_pop(array a);
彈出data中的最後一個元素，返回其指標，data中的最後一個位置設為NULL。

7、int array_print(array * a, int depth);
列印陣列中的內容。depth引數用於在列印多維陣列時，實現縮排。

8、a_unset array_get_unused_element(array a, data_type_t t);
返回第一個未使用的資料，也就是used位置的資料，這個資料不在陣列中，返回這個資料指標後，將data[unsed]設為NULL。可能返回NULL。

9、data_unset array_get_element(array a, const char *key);
根據key值，返回陣列中key值與之相同的資料

10、data_unset array_replace(array a, data_unset * du);
如果陣列中有與du的key值相同的資料，則用du替換那個資料，並返回那個資料的指標。如果不存在，則把du插入到陣列中。（呼叫data_insert_unique函式）

11、 int array_strcasecmp(const char *a, size_t a_len, const char *b, size_t b_len);
這個函式並沒實現，僅僅給出了上面的定義。

12、void array_print_indent(int depth);
根據depth列印空白，實現縮排。

13、size_t array_get_max_key_length(array * a);
返回陣列中最長的key的長度。

下面看看array_get_index函式：

/* 
 * sorted陣列是個下標陣列，存放的是排好序的輸入元素的下標（見前面的圖），
 * 相當於一個排好序的陣列。
 * 利用sorted陣列進行二分查詢。
 * 若找到，返回元素在data陣列中的位置，並通過rndx返回
 * 其在sorted陣列中的位置。
 * 若沒有找到，通過rndx返回此元素在sorted中的位置，並返回-1
 */
static int array_get_index(array *a, const char *key, size_t keylen, int *rndx) {
    int ndx = -1;
    int i, pos = 0; /* pos中存放的是元素在陣列data中的位置 */

    if (key == NULL) return -1;
/*  
 * 當data的空間不夠時，通用陣列每次為data增加16個空間，第一次初始化時，
 * data的長度為16。因此，size始終是16的倍數。
 * 而next_power_of_2是大於used最小的2的倍數，如used=5,那麼
 * next_power_of_2就等於8。
 * 這樣，used始終大於等於next_power_of_2的1/2。
 * 
 * next_power_of_2類似於一個標杆，利用這個標杆進行二分搜尋
 * 可以減少很多出錯的機率，也使程式更加易懂。
 */
    /* try to find the string */
    for (i = pos = a->next_power_of_2 / 2; ; i >>= 1) {
        int cmp;

        if (pos < 0) {
            pos += i;
        } else if (pos >= (int)a->used) {
            pos -= i;
        } else {
         /* 比較兩個元素的key值 */
            cmp = buffer_caseless_compare(key, keylen, a->data[a->sorted[pos]]->key->ptr, a->data[a->sorted[pos]]->key->used);

            if (cmp == 0) {
                /* found */
                ndx = a->sorted[pos];
                break;
            } else if (cmp < 0) {
                pos -= i; /* 所找資料在前半部分 */
            } else {
                pos += i; /* 所找資料在後半部分 */
            }
        }
        if (i == 0) break;
    }

    if (rndx) *rndx = pos;

    return ndx;
}

本資料結構的實現中，二分查詢是一個特色，然後用sorted陣列只對data中的資料的下標排序，也是一個很有用的技巧。

Lighttpd1.4.20原始碼分析筆記通用陣列array.c(h)

Lighttpd1.4.20原始碼分析筆記通用陣列array.c(h)

Lighttpd1.4.20原始碼分析筆記狀態機之請求處理

Lighttpd1.4.20原始碼分析筆記網路服務主模型

Lighttpd1.4.20原始碼分析筆記 fdevent系統-初始化

lighttpd1.4.20原始碼分析：安裝與配置

mybatis3.4.6 原始碼分析筆記

Lighttpd1.4.20源代碼分析筆記狀態機之錯誤處理和連接關閉

Opencv2.4.9原始碼分析——HoughCircles

Spring原始碼分析筆記--事務管理

mybatis3.4.6 原始碼分析

Clamav防毒軟體原始碼分析筆記八

lighttpd1.4.18程式碼分析(八)--狀態機(2)CON_STATE_READ狀態

lighttpd1.4.18程式碼分析(六)--處理連線fd的流程

Apache commons-pool2-2.4.2原始碼學習筆記

Opencv2.4.9原始碼分析——SURF

Opencv2.4.9原始碼分析——SIFT

Android4.4.2原始碼分析之WiFi模組（一）

虛幻4引擎原始碼學習筆記(二)：主迴圈LaunchEngineLoop

Opencv2.4.9原始碼分析——Stitching（六）

LEMON原始碼分析筆記——壓縮分析表

Lighttpd1.4.20原始碼分析 筆記 通用陣列array.c(h)

相關推薦

Lighttpd1.4.20原始碼分析筆記通用陣列array.c(h)