Redis 6.0 Multi-Threaded I/O
Background
Redis has always run in single-threaded mode, where "single-threaded" refers to the network I/O and command-execution parts. Version 6.0, released this year, adds multiple threads to handle network I/O (read, write) and command parsing.
Pros and cons of the single-threaded model
These are probably familiar to most readers, so here is just a brief overview.
Pros:
- Operations are pure in-memory work, so the CPU is not the bottleneck; running multiple processes is also an easy way to use multiple CPUs
- No multi-thread synchronization to worry about, which keeps development simple
- Command execution is naturally atomic
- I/O multiplexing handles large numbers of connections while avoiding the cost of thread context switches
Cons:
- A time-consuming operation blocks everything behind it
- A single instance cannot fully utilize a multi-core CPU (read/write still needs the CPU to copy data between kernel space and user space)
The Redis network I/O model in brief
Redis uses I/O multiplexing to manage its many network connections, and the code is structured in the Reactor pattern.
The main thread is an event loop.
A quick look at the source:
/* State of an event based program */
typedef struct aeEventLoop {
    int maxfd;   /* highest file descriptor currently registered */
    int setsize; /* max number of file descriptors tracked */
    long long timeEventNextId;
    time_t lastTime;     /* Used to detect system clock skew */
    aeFileEvent *events; /* Registered events */
    aeFiredEvent *fired; /* Fired events */
    aeTimeEvent *timeEventHead;
    int stop;
    void *apidata; /* This is used for polling API specific data */
    aeBeforeSleepProc *beforesleep;
    aeBeforeSleepProc *aftersleep;
    int flags;
} aeEventLoop;

struct redisServer {
    aeEventLoop *el;
    /* ... many other fields omitted ... */
};
The `el` field holds the event-loop state. Inside it, `void *apidata;` stores data specific to the I/O multiplexing API: Redis wraps several different multiplexing facilities (`select`, `epoll`, `kqueue`, and so on) and picks one at compile time according to the platform.
ae.c:
/* Include the best multiplexing layer supported by this system.
 * The following should be ordered by performances, descending. */
#ifdef HAVE_EVPORT
#include "ae_evport.c"
#else
    #ifdef HAVE_EPOLL
    #include "ae_epoll.c"
    #else
        #ifdef HAVE_KQUEUE
        #include "ae_kqueue.c"
        #else
        #include "ae_select.c"
        #endif
    #endif
#endif
FileEvent
A `FileEvent` is simply a network I/O event: read and write handler functions are bound to an fd, and when I/O multiplexing reports the fd as ready, the bound handler is invoked.
/* File event structure */
typedef struct aeFileEvent {
int mask; /* one of AE_(READABLE|WRITABLE|BARRIER) */
    aeFileProc *rfileProc; /* handler invoked when the fd is readable */
    aeFileProc *wfileProc; /* handler invoked when the fd is writable */
    void *clientData;      /* opaque pointer passed through to the handlers */
} aeFileEvent;
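To make the binding concrete, here is a minimal sketch against the real ae.h API (aeCreateEventLoop, aeCreateFileEvent, aeMain); the handler body and the createListeningSocket helper are made up for illustration:

#include "ae.h"
#include <unistd.h>

/* Hypothetical read handler: the event loop invokes it when fd is readable. */
void myReadHandler(aeEventLoop *el, int fd, void *clientData, int mask) {
    char buf[1024];
    ssize_t n = read(fd, buf, sizeof(buf));
    if (n <= 0) {                               /* error or peer closed */
        aeDeleteFileEvent(el, fd, AE_READABLE); /* stop watching the fd */
        close(fd);
        return;
    }
    /* ... parse the request and queue a reply ... */
}

int main(void) {
    aeEventLoop *el = aeCreateEventLoop(1024); /* setsize: max fds tracked */
    int fd = createListeningSocket();          /* hypothetical helper */
    /* Stores myReadHandler in events[fd].rfileProc and registers fd
     * with the selected multiplexing API. */
    aeCreateFileEvent(el, fd, AE_READABLE, myReadHandler, NULL);
    aeMain(el);                                /* the main-thread event loop */
    return 0;
}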
TimeEvent
A `TimeEvent` is a timer-task event. Each timer task is bound to a handler function, and the timing itself is implemented by cleverly reusing the blocking-timeout argument of the I/O multiplexing call: if the soonest pending timer is due in 100ms, the `select` timeout is set to 100ms, which yields a not-especially-precise timer. (The soonest timer is found with a linear scan, O(n); a structure such as a skip list would bring this down to O(log n), but presumably the author judged that there would never be many timer events, so it was left unoptimized.)
/* Time event structure */
typedef struct aeTimeEvent {
long long id; /* time event identifier. */
long when_sec; /* seconds */
long when_ms; /* milliseconds */
aeTimeProc *timeProc;
aeEventFinalizerProc *finalizerProc;
void *clientData;
struct aeTimeEvent *prev;
struct aeTimeEvent *next;
int refcount; /* refcount to prevent timer events from being
* freed in recursive time event calls. */
} aeTimeEvent;
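A simplified sketch of that mechanism, modeled on aeSearchNearestTimer and aeProcessEvents in ae.c (flags and edge cases omitted):

/* Linear scan for the soonest timer: the O(n) trade-off discussed above. */
static aeTimeEvent *searchNearestTimer(aeEventLoop *el) {
    aeTimeEvent *te = el->timeEventHead, *nearest = NULL;
    while (te) {
        if (!nearest || te->when_sec < nearest->when_sec ||
            (te->when_sec == nearest->when_sec &&
             te->when_ms < nearest->when_ms))
            nearest = te;
        te = te->next;
    }
    return nearest;
}

/* The delay until the soonest timer becomes the poll timeout, so the
 * multiplexing call wakes up roughly when the timer is due. */
static void pollWithTimerTimeout(aeEventLoop *el) {
    struct timeval tv, *tvp = NULL;
    aeTimeEvent *te = searchNearestTimer(el);
    if (te) {
        long now_sec, now_ms;
        aeGetTime(&now_sec, &now_ms);   /* helper from ae.c */
        long long ms = (te->when_sec - now_sec) * 1000
                     + (te->when_ms - now_ms);
        if (ms < 0) ms = 0;             /* already due: don't block */
        tv.tv_sec = ms / 1000;
        tv.tv_usec = (ms % 1000) * 1000;
        tvp = &tv;
    }
    aeApiPoll(el, tvp); /* NULL timeout = block until an fd is ready */
}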
Multi-threading in version 6.0
I/O threading configuration
First, the multi-threading parameters and their documentation in the 6.0 configuration file:
################################ THREADED I/O #################################
# Redis is mostly single threaded, however there are certain threaded
# operations such as UNLINK, slow I/O accesses and other things that are
# performed on side threads.
#
# Now it is also possible to handle Redis clients socket reads and writes
# in different I/O threads. Since especially writing is so slow, normally
# Redis users use pipelining in order to speed up the Redis performances per
# core, and spawn multiple instances in order to scale more. Using I/O
# threads it is possible to easily speedup two times Redis without resorting
# to pipelining nor sharding of the instance.
#
# By default threading is disabled, we suggest enabling it only in machines
# that have at least 4 or more cores, leaving at least one spare core.
# Using more than 8 threads is unlikely to help much. We also recommend using
# threaded I/O only if you actually have performance problems, with Redis
# instances being able to use a quite big percentage of CPU time, otherwise
# there is no point in using this feature.
#
# So for instance if you have a four cores boxes, try to use 2 or 3 I/O
# threads, if you have a 8 cores, try to use 6 threads. In order to
# enable I/O threads use the following configuration directive:
#
# io-threads 4
#
# Setting io-threads to 1 will just use the main thread as usual.
# When I/O threads are enabled, we only use threads for writes, that is
# to thread the write(2) syscall and transfer the client buffers to the
# socket. However it is also possible to enable threading of reads and
# protocol parsing using the following configuration directive, by setting
# it to yes:
#
# io-threads-do-reads no
#
# Usually threading reads doesn't help much.
#
# NOTE 1: This configuration directive cannot be changed at runtime via
# CONFIG SET. Also, this feature currently does not work when SSL is
# enabled.
#
# NOTE 2: If you want to test the Redis speedup using redis-benchmark, make
# sure you also run the benchmark itself in threaded mode, using the
# --threads option to match the number of Redis threads, otherwise you'll not
# be able to notice the improvements.
The points to take away from this:
- Threading is disabled by default
- The I/O threads are used for the read and write calls
- Redis can be sped up roughly 2x without running multiple instances
- The io-threads directive sets how many I/O threads there are
- If io-threads is 1 there is only the main thread; if it is 2 one extra I/O thread is started, and so on
- By default only write uses the I/O threads; io-threads-do-reads controls whether read is threaded as well (a minimal redis.conf sketch follows this list)
- Threading reads doesn't help all that much
- This configuration is not yet supported in SSL mode
- When benchmarking a multi-threaded Redis, run redis-benchmark with its own --threads option enabled too
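Putting those points together, enabling threaded reads and writes is just two directives in redis.conf; here is a minimal sketch for a machine with at least 4 cores, per the guidance above:

io-threads 4
io-threads-do-reads yes

Note that, per the config documentation quoted above, neither directive can be changed at runtime via CONFIG SET, so a restart is needed to adjust them.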
A look at the source
The core I/O-thread code lives in networking.c:
Key global variables
pthread_t io_threads[IO_THREADS_MAX_NUM];
pthread_mutex_t io_threads_mutex[IO_THREADS_MAX_NUM];
_Atomic unsigned long io_threads_pending[IO_THREADS_MAX_NUM];
int io_threads_op; /* IO_THREADS_OP_WRITE or IO_THREADS_OP_READ. */
/* This is the list of clients each thread will serve when threaded I/O is
* used. We spawn io_threads_num-1 threads, since one is the main thread
* itself. */
list *io_threads_list[IO_THREADS_MAX_NUM];
- io_threads: the pthread handles of the I/O threads
- io_threads_mutex: mutexes used by the main thread to stop and resume the I/O threads
- io_threads_pending: atomic counters for synchronizing with the main thread; a non-zero io_threads_pending[i] means thread i has that many clients pending for read/write
- io_threads_op: whether the current batch of operations is a read or a write
- io_threads_list: one client queue per thread; each thread walks its own list and serves the clients on it in turn (see the initialization sketch after this list)
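These pieces are wired together at startup; here is a simplified sketch of initThreadedIO() from networking.c, with assertions and error handling dropped:

/* Spawn io_threads_num - 1 threads; slot 0 belongs to the main thread. */
void initThreadedIO(void) {
    server.io_threads_active = 0;           /* threads start out parked */
    if (server.io_threads_num == 1) return; /* pure single-threaded mode */
    for (int i = 0; i < server.io_threads_num; i++) {
        io_threads_list[i] = listCreate();  /* every slot gets a client queue */
        if (i == 0) continue;               /* slot 0: the main thread itself */
        pthread_t tid;
        pthread_mutex_init(&io_threads_mutex[i], NULL);
        io_threads_pending[i] = 0;
        /* Hold the mutex so the new thread blocks inside IOThreadMain
         * until the main thread decides to start it. */
        pthread_mutex_lock(&io_threads_mutex[i]);
        pthread_create(&tid, NULL, IOThreadMain, (void *)(long)i);
        io_threads[i] = tid;
    }
}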
Key code logic
Before each event-loop iteration (in beforeSleep), handleClientsWithPendingReadsUsingThreads and handleClientsWithPendingWritesUsingThreads are called to wake the I/O threads for reads and writes respectively.
In handleClientsWithPendingReadsUsingThreads you can see that the ready clients are distributed evenly across the N I/O threads:
/* Distribute the clients across N different lists. */
listIter li;
listNode *ln;
listRewind(server.clients_pending_read,&li);
int item_id = 0;
while((ln = listNext(&li))) {
client *c = listNodeValue(ln);
int target_id = item_id % server.io_threads_num;
listAddNodeTail(io_threads_list[target_id],c);
item_id++;
}
The I/O threads are then woken by setting the io_threads_pending counters. With io-threads=4, io-threads - 1 = 3 extra threads are started, because the main thread itself also acts as an I/O thread: it serves the clients in io_threads_list[0].
/* Give the start condition to the waiting threads, by setting the
* start condition atomic var. */
io_threads_op = IO_THREADS_OP_READ;
for (int j = 1; j < server.io_threads_num; j++) {
int count = listLength(io_threads_list[j]);
io_threads_pending[j] = count;
}
/* Also use the main thread to process a slice of clients. */
listRewind(io_threads_list[0],&li);
while((ln = listNext(&li))) {
client *c = listNodeValue(ln);
readQueryFromClient(c->conn);
}
listEmpty(io_threads_list[0]);
Once the main thread has finished its own slice of I/O, it spins in a loop waiting for the other I/O threads to finish their reads, and only then moves on to executing commands. At that point the data has been read and the commands parsed inside the I/O threads; execution itself still happens only on the main thread, which preserves command atomicity.
/* Wait for all the other threads to end their work. */
while(1) {
unsigned long pending = 0;
for (int j = 1; j < server.io_threads_num; j++)
pending += io_threads_pending[j];
if (pending == 0) break;
}
if (tio_debug) printf("I/O READ All threads finshed\n");
/* Run the list of clients again to process the new buffers. */
while(listLength(server.clients_pending_read)) {
ln = listFirst(server.clients_pending_read);
client *c = listNodeValue(ln);
c->flags &= ~CLIENT_PENDING_READ;
listDelNode(server.clients_pending_read,ln);
if (c->flags & CLIENT_PENDING_COMMAND) {
c->flags &= ~CLIENT_PENDING_COMMAND;
if (processCommandAndResetClient(c) == C_ERR) {
/* If the client is no longer valid, we avoid
* processing the client later. So we just go
* to the next. */
continue;
}
}
processInputBuffer(c);
}
The I/O threads' run loop is in IOThreadMain:
Each thread spins waiting for its io_threads_pending slot to become non-zero. Spinning indefinitely would peg a CPU core, so there is also the io_threads_mutex mutex, which the main thread can use to park an idle I/O thread by leaving it blocked in pthread_mutex_lock.
/* Wait for start */
for (int j = 0; j < 1000000; j++) {
if (io_threads_pending[id] != 0) break;
}
/* Give the main thread a chance to stop this thread. */
if (io_threads_pending[id] == 0) {
pthread_mutex_lock(&io_threads_mutex[id]);
pthread_mutex_unlock(&io_threads_mutex[id]);
continue;
}
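The main-thread side of this handshake is startThreadedIO()/stopThreadedIO(), simplified here from networking.c: a parked thread sits blocked on its mutex, and the main thread releases or re-takes the mutexes to start or stop the pool:

/* Unlock each thread's mutex: parked threads fall through the
 * pthread_mutex_lock above and resume their loop. */
void startThreadedIO(void) {
    for (int j = 1; j < server.io_threads_num; j++)
        pthread_mutex_unlock(&io_threads_mutex[j]);
    server.io_threads_active = 1;
}

/* Re-take each mutex: the next time a thread sees no pending work it
 * blocks in pthread_mutex_lock until startThreadedIO() runs again. */
void stopThreadedIO(void) {
    for (int j = 1; j < server.io_threads_num; j++)
        pthread_mutex_lock(&io_threads_mutex[j]);
    server.io_threads_active = 0;
}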
Next, io_threads_op tells the thread whether the batch is a read or a write, and it calls read or write accordingly:
/* Process: note that the main thread will never touch our list
* before we drop the pending count to 0. */
listIter li;
listNode *ln;
listRewind(io_threads_list[id],&li);
while((ln = listNext(&li))) {
client *c = listNodeValue(ln);
if (io_threads_op == IO_THREADS_OP_WRITE) {
writeToClient(c,0);
} else if (io_threads_op == IO_THREADS_OP_READ) {
readQueryFromClient(c->conn);
} else {
serverPanic("io_threads_op value is unknown");
}
}
listEmpty(io_threads_list[id]);
io_threads_pending[id] = 0;
Finally, let's look at the server's threads. With io-threads=4, there are 3 extra I/O threads:
top -Hp 17339
Flow of the multi-threaded I/O mode
Performance testing
The tests use redis-benchmark, the benchmarking tool that ships with Redis.
Comparison chart
Detailed data
Redis 6.0.9, with 4 I/O threads.
Multi-threaded: ./redis-benchmark -c 1000 -n 1000000 --threads 4 --csv
Single-threaded: ./redis-benchmark -c 1000 -n 1000000 --csv
Table:
cmd | 4 threads, io-threads-do-reads yes (req/s) | 4 threads, io-threads-do-reads no (req/s) | 1 thread (req/s) |
---|---|---|---|
PING_INLINE | 472589.81 | 363240.09 | 215610.17 |
PING_BULK | 515198.34 | 423908.44 | 213766.56 |
SET | 442673.75 | 372162.25 | 213401.62 |
GET | 476644.41 | 400320.28 | 212901.84 |
INCR | 460829.47 | 389559.81 | 214408.23 |
LPUSH | 399520.56 | 346500.34 | 220896.84 |
RPUSH | 430292.62 | 358680.03 | 217391.31 |
LPOP | 404203.72 | 344946.53 | 222024.86 |
RPOP | 399680.25 | 333111.25 | 215517.25 |
SADD | 450856.66 | 363372.09 | 216590.86 |
HSET | 399680.25 | 333111.25 | 217344.06 |
SPOP | 486854.94 | 405350.62 | 213401.62 |
ZADD | 415627.62 | 333222.28 | 217912.39 |
ZPOPMIN | 444049.72 | 402900.88 | 216122.77 |
LPUSH (needed to benchmark LRANGE) | 410677.62 | 342114.25 | 218914.19 |
LRANGE_100 (first 100 elements) | 113869.28 | 110168.56 | 75483.09 |
LRANGE_300 (first 300 elements) | 45687.13 | 44081.99 | 27139.99 |
LRANGE_500 (first 450 elements) | 31991.81 | 31406.05 | 20085.56 |
LRANGE_600 (first 600 elements) | 24688.31 | 23973.34 | 15635.75 |
MSET (10 keys) | 226244.34 | 200240.30 | 175500.17 |
Summary
Since 6.0, Redis adds multi-threading for network I/O. The I/O threads are responsible only for read, command parsing, and write; command execution stays on the main thread and remains atomic.
With four I/O threads, GET and SET throughput relative to single-threaded mode is about 2x with both write and read threading enabled, and about 1.68x with only write threading enabled.
To close, something to ponder: why does multi-threading the read path bring such a comparatively modest improvement?