
How CPU Load and System Load Are Calculated

The notion of load in the scheduler is not the same thing as the familiar CPU usage percentage; the two differ substantially. This article analyzes CPU load and system load, not CPU utilization. The code is based on CAF SM8250, kernel 4.19.

Load calculation breaks down into three main parts, from smallest scope to largest:

1. Scheduling-entity load: update_load_avg() -- PELT

2. CPU load: cpu_load_update_active()

3. System load: calc_global_load_tick()

These three levels of load calculation reflect the current load as seen from three different scopes.

A previous post analyzed load at the sched entity level, which is tracked by the PELT mechanism. It reflects, and continuously updates, the load each scheduling entity places on the CPU. (On review, that analysis was not detailed or precise enough; it deserves another write-up later.)

This post covers the remaining two: how CPU load and system load are calculated.

How CPU Load Is Calculated

CPU load reflects how much work is queued on a CPU, i.e. how busy it is. It is derived mainly from the average time that tasks on the CPU's runqueue spend in the runnable state (runnable_load_avg = runnable_load_sum / LOAD_AVG_MAX). From that, averages over several different periods are maintained, much like the moving-average lines on a stock chart, to show the trend of CPU load.

The average runnable time of a single task is tracked by the PELT algorithm. So this analysis focuses on how those per-task statistics are aggregated into moving averages over different periods to represent CPU load.

The code path is as follows:

scheduler_tick()
-> cpu_load_update_active()

void cpu_load_update_active(struct rq *this_rq)
{
    unsigned long load = weighted_cpuload(this_rq); //load = cfs_rq->avg.runnable_load_avg

    if (tick_nohz_tick_stopped())
        cpu_load_update_nohz(this_rq, READ_ONCE(jiffies), load);    //(1)
    else
        cpu_load_update_periodic(this_rq, load);    //(2)
}

Depending on whether NO_HZ is configured (and the tick may therefore have been stopped), one of two branches is taken:

(1) cpu_load_update_nohz(this_rq, READ_ONCE(jiffies), load) // update the CPU load for the NO_HZ case
(2) cpu_load_update_periodic() // update the CPU load periodically, for the case where the tick is running

In the NO_HZ case the first branch is taken. pending_updates is the number of jiffies that have elapsed, i.e. the number of ticks. Elapsed seconds = jiffies / HZ (HZ=250 on this platform); one tick is 1/HZ, i.e. 4 ms.

/*
 * There is no sane way to deal with nohz on smp when using jiffies because the
 * CPU doing the jiffies update might drift wrt the CPU doing the jiffy reading
 * causing off-by-one errors in observed deltas; {0,2} instead of {1,1}.
 *
 * Therefore we need to avoid the delta approach from the regular tick when
 * possible since that would seriously skew the load calculation. This is why we
 * use cpu_load_update_periodic() for CPUs out of nohz. However we'll rely on
 * jiffies deltas for updates happening while in nohz mode (idle ticks, idle
 * loop exit, nohz_idle_balance, nohz full exit...)
 *
 * This means we might still be one tick off for nohz periods.
 */

static void cpu_load_update_nohz(struct rq *this_rq,
                 unsigned long curr_jiffies,
                 unsigned long load)
{
    unsigned long pending_updates;

    pending_updates = curr_jiffies - this_rq->last_load_update_tick;    //compute pending_updates
    if (pending_updates) {
        this_rq->last_load_update_tick = curr_jiffies;  //update the timestamp
        /*
         * In the regular NOHZ case, we were idle, this means load 0.
         * In the NOHZ_FULL case, we were non-idle, we should consider
         * its weighted load.
         */
        cpu_load_update(this_rq, load, pending_updates);    //(2-1) update the rq's cpu load statistics
    }
}

Now look at branch (2):

```
static void cpu_load_update_periodic(struct rq *this_rq, unsigned long load)
{
#ifdef CONFIG_NO_HZ_COMMON
    /* See the mess around cpu_load_update_nohz(). */
    this_rq->last_load_update_tick = READ_ONCE(jiffies);    //record the update timestamp
#endif
    cpu_load_update(this_rq, load, 1);  //(2-1) update the rq's cpu load statistics
}
```

Both paths end up calling cpu_load_update() to update the rq's cpu load statistics.

(2-1) Updating the rq's cpu load statistics

/**
 * __cpu_load_update - update the rq->cpu_load[] statistics
 * @this_rq: The rq to update statistics for
 * @this_load: The current load
 * @pending_updates: The number of missed updates
 *
 * Update rq->cpu_load[] statistics. This function is usually called every
 * scheduler tick (TICK_NSEC).
 *
 * This function computes a decaying average:
 *
 *   load[i]' = (1 - 1/2^i) * load[i] + (1/2^i) * load
 *
 * Because of NOHZ it might not get called on every tick which gives need for
 * the @pending_updates argument.
 *
 *   load[i]_n = (1 - 1/2^i) * load[i]_n-1 + (1/2^i) * load_n-1
 *             = A * load[i]_n-1 + B ; A := (1 - 1/2^i), B := (1/2^i) * load
 *             = A * (A * load[i]_n-2 + B) + B
 *             = A * (A * (A * load[i]_n-3 + B) + B) + B
 *             = A^3 * load[i]_n-3 + (A^2 + A + 1) * B
 *             = A^n * load[i]_0 + (A^(n-1) + A^(n-2) + ... + 1) * B
 *             = A^n * load[i]_0 + ((1 - A^n) / (1 - A)) * B
 *             = (1 - 1/2^i)^n * (load[i]_0 - load) + load
 *
 * In the above we've assumed load_n := load, which is true for NOHZ_FULL as
 * any change in load would have resulted in the tick being turned back on.
 *
 * For regular NOHZ, this reduces to:
 *
 *   load[i]_n = (1 - 1/2^i)^n * load[i]_0
 *
 * see decay_load_misses(). For NOHZ_FULL we get to subtract and add the extra
 * term.
 */
static void cpu_load_update(struct rq *this_rq, unsigned long this_load,
                unsigned long pending_updates)
{
    unsigned long __maybe_unused tickless_load = this_rq->cpu_load[0];  //the cpu_load[0] value from the previous update
    int i, scale;

    this_rq->nr_load_updates++;     //count cpu load updates

    /* Update our load: */
    this_rq->cpu_load[0] = this_load; /* Fasttrack for idx 0 */         //update cpu_load[0]
    for (i = 1, scale = 2; i < CPU_LOAD_IDX_MAX; i++, scale += scale) { //update cpu_load[1..4]
        unsigned long old_load, new_load;

        /* scale is effectively 1 << i now, and >> i divides by scale */

        old_load = this_rq->cpu_load[i];
#ifdef CONFIG_NO_HZ_COMMON
        old_load = decay_load_missed(old_load, pending_updates - 1, i);     //(2-1-1) decay the old load; if pending_updates == 1, no decay is needed
        if (tickless_load) {                                   //if cpu_load[0] carried load before the tickless period
            old_load -= decay_load_missed(tickless_load, pending_updates - 1, i);  //then tickless_load must be decayed as well [why tickless_load is involved is discussed below]
            /*
             * old_load can never be a negative value because a
             * decayed tickless_load cannot be greater than the
             * original tickless_load.
             */
            old_load += tickless_load;
        }
#endif
        new_load = this_load;
        /*
         * Round up the averaging division if load is increasing. This
         * prevents us from getting stuck on 9 if the load is 10, for
         * example.
         */
        if (new_load > old_load)        //compensation: without this round-up, whenever old_load < new_load the final cpu load could never reach the maximum
            new_load += scale - 1;

        this_rq->cpu_load[i] = (old_load * (scale - 1) + new_load) >> i;    //compute the new cpu load
    }
}

To make this easier to follow, start with the last line, which computes the new cpu load. There are actually 5 values, each a moving average over a different period length:

i   period (ticks)   update formula
0   0                cpu_load[0] = cur_load
1   8                cpu_load[1] = 1/2 * old_load + 1/2 * cur_load
2   32               cpu_load[2] = 3/4 * old_load + 1/4 * cur_load
3   64               cpu_load[3] = 7/8 * old_load + 1/8 * cur_load
4   128              cpu_load[4] = 15/16 * old_load + 1/16 * cur_load

To explain the table: for CPU load, 5 moving averages are maintained, over periods of {0, 8, 32, 64, 128} ticks. Think of each as covering the load over the last "period" ticks, counted back from the current tick; if there was no load within the period, that average decays to 0.

On each update, the formula used is: load = (2^idx - 1) / 2^idx * load + 1 / 2^idx * cur_load, where idx is the [i] column in the table, load is the old load and cur_load is the new load.

Rearranged: current load = decay coefficient * old load + (1 - decay coefficient) * new load, with decay coefficient = (2^idx - 1) / 2^idx. This is the [cpu_load] formula referred to below.

Now consider a tickless system (ours is a tickless/NO_HZ system). If some ticks were missed, the old load must first be decayed before the new load is folded in to recompute cpu_load[1..4]. A lookup table is used here to cut down the computation and improve performance.

PS: NO_HZ generally means the CPU went idle. While idle, the load is taken to be 0, so in effect all that is needed is to decay the old load.

(2-1-1) Decay the old load; if pending_updates == 1, no decay is needed

/*
 * The exact cpuload calculated at every tick would be:
 *
 *   load' = (1 - 1/2^i) * load + (1/2^i) * cur_load
 *
 * If a CPU misses updates for n ticks (as it was idle) and update gets
 * called on the n+1-th tick when CPU may be busy, then we have:
 *
 *   load_n   = (1 - 1/2^i)^n * load_0
 *   load_n+1 = (1 - 1/2^i)   * load_n + (1/2^i) * cur_load
 *
 * decay_load_missed() below does efficient calculation of
 *
 *   load' = (1 - 1/2^i)^n * load
 *
 * Because x^(n+m) := x^n * x^m we can decompose any x^n in power-of-2 factors.
 * This allows us to precompute the above in said factors, thereby allowing the
 * reduction of an arbitrary n in O(log_2 n) steps. (See also
 * fixed_power_int())
 *
 * The calculation is approximated on a 128 point scale.
 */
#define DEGRADE_SHIFT        7

static const u8 degrade_zero_ticks[CPU_LOAD_IDX_MAX] = {0, 8, 32, 64, 128};
static const u8 degrade_factor[CPU_LOAD_IDX_MAX][DEGRADE_SHIFT + 1] = {
    {   0,   0,  0,  0,  0,  0, 0, 0 },
    {  64,  32,  8,  0,  0,  0, 0, 0 },
    {  96,  72, 40, 12,  1,  0, 0, 0 },
    { 112,  98, 75, 43, 15,  1, 0, 0 },
    { 120, 112, 98, 76, 45, 16, 2, 0 }
};

/*
 * Update cpu_load for any missed ticks, due to tickless idle. The backlog
 * would be when CPU is idle and so we just decay the old load without
 * adding any new load.
 */
static unsigned long
decay_load_missed(unsigned long load, unsigned long missed_updates, int idx)
{
    int j = 0;

    if (!missed_updates)
        return load;

    if (missed_updates >= degrade_zero_ticks[idx])  //if the missed ticks exceed this period's threshold, the system has slept that long, so the old load must be cleared
        return 0;

    if (idx == 1)
        return load >> missed_updates;  //for the idx == 1 average, old and new each weigh 1/2, so decaying is just a right shift by missed_updates

    while (missed_updates) {
        if (missed_updates % 2)
            load = (load * degrade_factor[idx][j]) >> DEGRADE_SHIFT;

        missed_updates >>= 1;
        j++;
    }
    return load;
}

For every period's moving average, the decay uses the same formula:

load_n = (1 - 1/2^i)^n * load_0, where i is idx and n is pending_updates - 1, i.e. missed_updates

tickless_load

Now about tickless_load. I noticed it did not exist in earlier kernel versions; back then, the decay of the cpu load followed the two formulas above exactly.

tickless_load was added to also account for the cpu_load[0] value at the last update tick, folding it into the old load (old_load).

My reading of the motivation is a scenario like this:

1. At the previous update of a given average, the load was small -- this part is the old load

2. A brief spike of load followed -- this part is tickless_load

3. Before the next update of that average, the CPU went idle

4. At wakeup, the load is again small

5. At this update of the average, the load is also small (same as at wakeup)

Without tickless_load, this update would only decay the old load and then mix in the current load. The result would diverge noticeably from reality: the longer the period, the smaller the weight of the new load and the larger the weight of the old load, so part of the actual average load would be missed entirely. The spike would never be reflected, and the computed value could not represent the CPU's true average load.

With tickless_load, the load present just before going idle is also folded into the old load, so that brief burst of load is no longer lost.

So the final decay formula, with tickless_load taken into account, should be:

load_n = (1 - 1/2^i)^n * load_0 + [1 - (1 - 1/2^i)^n] * tickless_load

The result load_n is the updated old_load; the [cpu_load] formula above is then applied to it to produce the latest cpu load value.

This reading of tickless_load is my own interpretation; the safest way to confirm it would be to check the commit message in the kernel history. If anything here is wrong, please point it out.

Why is CPU load designed this way?

1. The averages over different periods reflect the load over windows of different lengths. They mainly serve load_balance() as comparison baselines when deciding whether to balance in different scenarios; cpu_load[0] and cpu_load[1] are the most used (to be checked in detail when analyzing load_balance later).

2. Using averages over several periods smooths out sample jitter and makes the direction of the trend clear.

How System Load Is Calculated

The system-wide load average can be viewed with any of these commands (uptime, top, cat /proc/loadavg):

$ uptime
 16:48:24 up  4:11,  1 user,  load average: 25.25, 23.40, 23.46

$ top - 16:48:42 up  4:12,  1 user,  load average: 25.25, 23.14, 23.37

$ cat /proc/loadavg 
25.72 23.19 23.35 42/3411 43603

The 3 numbers after "load average:" are the 1-minute, 5-minute and 15-minute load averages. They can be read in several ways:

If the averages are 0.0, then your system is idle.

If the 1 minute average is higher than the 5 or 15 minute averages, then load is increasing.

If the 1 minute average is lower than the 5 or 15 minute averages, then load is decreasing.

If they are higher than your CPU count, then you might have a performance problem (it depends).


Originally, the system load average only counted tasks in the runnable state, but Linux later concluded that this did not represent the system's real load.
An example: after swapping in a slower disk, the runnable load can come out lower than with a fast disk. Linux therefore treats uninterruptible sleep (TASK_UNINTERRUPTIBLE) as load too: the system is failing to get service because the io/peripheral load is too heavy.
The system-load accounting function calc_global_load_tick() counts (this_rq->nr_running + this_rq->nr_uninterruptible) into the load.

The code breaks down as follows:

scheduler_tick()
->  calc_global_load_tick()

/*
 * Called from scheduler_tick() to periodically update this CPU's
 * active count.
 */
void calc_global_load_tick(struct rq *this_rq)
{
    long delta;

    if (time_before(jiffies, this_rq->calc_load_update))    //skip if the system load has already been updated in this window
        return;

    delta  = calc_load_fold_active(this_rq, 0);     //(1) update the count of nr_running + uninterruptible tasks
    if (delta)
        atomic_long_add(delta, &calc_load_tasks);   //fold the task count into the global variable calc_load_tasks

    this_rq->calc_load_update += LOAD_FREQ;     //next system-load update time (now + 5 s)
}

long calc_load_fold_active(struct rq *this_rq, long adjust)
{
    long nr_active, delta = 0;

    nr_active = this_rq->nr_running - adjust;
    nr_active += (long)this_rq->nr_uninterruptible;     //count all nr_running and uninterruptible tasks

    if (nr_active != this_rq->calc_load_active) {       //this_rq->calc_load_active holds the previously recorded nr_running + uninterruptible count
        delta = nr_active - this_rq->calc_load_active;  //compute the difference
        this_rq->calc_load_active = nr_active;      //record the new count
    }

    return delta;
}

This part runs at intervals of at least 5 s. It updates the number of nr_running + uninterruptible tasks in the system and folds it into the global variable calc_load_tasks.

Besides that, the system load itself is computed when the system's jiffies value is updated.
tick_setup_sched_timer is registered to emulate the tick with a high-resolution timer, update jiffies, and so on; the system load is computed along the way.
When the timer fires, tick_sched_timer runs; if the current cpu is tick_do_timer_cpu, the system load is computed. The calculation is therefore performed by one CPU only: tick_do_timer_cpu.

tick_sched_timer()
    -> tick_sched_do_timer()
        ->tick_do_update_jiffies64()
           -> do_timer()
                -> calc_global_load()

/*
 * calc_load - update the avenrun load estimates 10 ticks after the
 * CPUs have updated calc_load_tasks.
 *
 * Called from the global timer code.
 */
void calc_global_load(unsigned long ticks)
{
    unsigned long sample_window;
    long active, delta;

    sample_window = READ_ONCE(calc_load_update);    //fetch the timestamp updated in scheduler_tick
    if (time_before(jiffies, sample_window + 10))   //wait until 10 ticks past the timestamp (so all cpus have updated calc_load_tasks) before computing the system load (total interval: 5 s + 10 ticks)
        return;

    /*
     * Fold the 'old' NO_HZ-delta to include all NO_HZ CPUs.
     */
    delta = calc_load_nohz_fold();      //(2) fold in task counts from NO_HZ cpus (they may have missed the accounting while idle)
    if (delta)
        atomic_long_add(delta, &calc_load_tasks);   //update the global nr_running + uninterruptible task count

    active = atomic_long_read(&calc_load_tasks);    //read the global nr_running + uninterruptible task count
    active = active > 0 ? active * FIXED_1 : 0;     //scale by FIXED_1

    avenrun[0] = calc_load(avenrun[0], EXP_1, active);      //(3) compute the 1-minute system load
    avenrun[1] = calc_load(avenrun[1], EXP_5, active);      //compute the 5-minute system load
    avenrun[2] = calc_load(avenrun[2], EXP_15, active);     //compute the 15-minute system load

    WRITE_ONCE(calc_load_update, sample_window + LOAD_FREQ);        //update the timestamp

    /*
     * In case we went to NO_HZ for multiple LOAD_FREQ intervals
     * catch up in bulk.
     */
    calc_global_nohz();     //(4)
}

(2) Fold in the task counts from NO_HZ cpus (they may have missed the accounting while idle, so it is caught up here)

static long calc_load_nohz_fold(void)
{
    int idx = calc_load_read_idx();
    long delta = 0;

    if (atomic_long_read(&calc_load_nohz[idx]))
        delta = atomic_long_xchg(&calc_load_nohz[idx], 0);

    return delta;
}

/*
 * Handle NO_HZ for the global load-average.
 *
 * Since the above described distributed algorithm to compute the global
 * load-average relies on per-CPU sampling from the tick, it is affected by
 * NO_HZ.
 *
 * The basic idea is to fold the nr_active delta into a global NO_HZ-delta upon
 * entering NO_HZ state such that we can include this as an 'extra' CPU delta
 * when we read the global state.
 *
 * Obviously reality has to ruin such a delightfully simple scheme:
 *
 *  - When we go NO_HZ idle during the window, we can negate our sample
 *    contribution, causing under-accounting.
 *
 *    We avoid this by keeping two NO_HZ-delta counters and flipping them
 *    when the window starts, thus separating old and new NO_HZ load.
 *
 *    The only trick is the slight shift in index flip for read vs write.
 *
 *        0s            5s            10s           15s
 *          +10           +10           +10           +10
 *        |-|-----------|-|-----------|-|-----------|-|
 *    r:0 0 1           1 0           0 1           1 0
 *    w:0 1 1           0 0           1 1           0 0
 *
 *    This ensures we'll fold the old NO_HZ contribution in this window while
 *    accumlating the new one.
 *
 *  - When we wake up from NO_HZ during the window, we push up our
 *    contribution, since we effectively move our sample point to a known
 *    busy state.
 *
 *    This is solved by pushing the window forward, and thus skipping the
 *    sample, for this CPU (effectively using the NO_HZ-delta for this CPU which
 *    was in effect at the time the window opened). This also solves the issue
 *    of having to deal with a CPU having been in NO_HZ for multiple LOAD_FREQ
 *    intervals.
 *
 * When making the ILB scale, we should try to pull this in as well.
 */
static atomic_long_t calc_load_nohz[2];
static int calc_load_idx;

(3) Computing the 1-minute system load; the 5- and 15-minute loads are computed the same way

/*
 * a1 = a0 * e + a * (1 - e)
 */
static inline unsigned long
calc_load(unsigned long load, unsigned long exp, unsigned long active)
{
    unsigned long newload;

    newload = load * exp + active * (FIXED_1 - exp);
    if (active >= load)
        newload += FIXED_1-1;

    return newload / FIXED_1;
}

#define FSHIFT        11        /* nr of bits of precision */
#define FIXED_1        (1<<FSHIFT)    /* 1.0 as fixed-point */
#define LOAD_FREQ    (5*HZ+1)    /* 5 sec intervals */
#define EXP_1        1884        /* 1/exp(5sec/1min) as fixed-point */
#define EXP_5        2014        /* 1/exp(5sec/5min) */
#define EXP_15        2037        /* 1/exp(5sec/15min) */

The core idea of calc_load() is: old_load * decay coefficient + new_load * (1 - decay coefficient)

The 1-minute load formula:

avenrun[0]' = avenrun[0] * (EXP_1 / FIXED_1) + active * (1 - EXP_1 / FIXED_1)

where:

FIXED_1 = 2^11 = 2048

EXP_1 = 1884

EXP_5 = 2014

EXP_15 = 2037

For the 5- and 15-minute loads, simply replace EXP_1 in the formula with EXP_5/EXP_15.

Seen this way, the system load is in essence a statistic over the number of nr_running + uninterruptible tasks.

(4) NO_HZ may mean several whole update windows were missed, so those must be accounted for as well: the load is recomputed from the actual number of missed windows.

/*
 * NO_HZ can leave us missing all per-CPU ticks calling
 * calc_load_fold_active(), but since a NO_HZ CPU folds its delta into
 * calc_load_nohz per calc_load_nohz_start(), all we need to do is fold
 * in the pending NO_HZ delta if our NO_HZ period crossed a load cycle boundary.
 *
 * Once we've updated the global active value, we need to apply the exponential
 * weights adjusted to the number of cycles missed.
 */
static void calc_global_nohz(void)
{
    unsigned long sample_window;
    long delta, active, n;

    sample_window = READ_ONCE(calc_load_update);
    if (!time_before(jiffies, sample_window + 10)) {
        /*
         * Catch-up, fold however many we are behind still
         */
        delta = jiffies - sample_window - 10;
        n = 1 + (delta / LOAD_FREQ);

        active = atomic_long_read(&calc_load_tasks);
        active = active > 0 ? active * FIXED_1 : 0;

        avenrun[0] = calc_load_n(avenrun[0], EXP_1, active, n);
        avenrun[1] = calc_load_n(avenrun[1], EXP_5, active, n);
        avenrun[2] = calc_load_n(avenrun[2], EXP_15, active, n);

        WRITE_ONCE(calc_load_update, sample_window + n * LOAD_FREQ);
    }

    /*
     * Flip the NO_HZ index...
     *
     * Make sure we first write the new time then flip the index, so that
     * calc_load_write_idx() will see the new time when it reads the new
     * index, this avoids a double flip messing things up.
     */
    smp_wmb();
    calc_load_idx++;
}

The system load can then be read via cat /proc/loadavg:

The code is:

static int loadavg_proc_show(struct seq_file *m, void *v)
{
    unsigned long avnrun[3];

    get_avenrun(avnrun, FIXED_1/200, 0);

    seq_printf(m, "%lu.%02lu %lu.%02lu %lu.%02lu %ld/%d %d\n",
        LOAD_INT(avnrun[0]), LOAD_FRAC(avnrun[0]),
        LOAD_INT(avnrun[1]), LOAD_FRAC(avnrun[1]),
        LOAD_INT(avnrun[2]), LOAD_FRAC(avnrun[2]),
        nr_running(), nr_threads,
        idr_get_cursor(&task_active_pid_ns(current)->idr) - 1);
    return 0;
}

Summary

1. CPU load is computed on every scheduler_tick.
The input is the rq's runnable_load_avg, and the formula is: current load = decay coefficient * old load + (1 - decay coefficient) * new load, with decay coefficient = (2^idx - 1) / 2^idx, idx = 0,1,2,3,4.
Five moving averages over different periods are maintained, serving as comparison baselines for different scenarios and showing the load's trend.

2. System load is computed in two stages: scheduler_tick counts the runnable + uninterruptible tasks (every 5 s), and when the sched tick timer fires (every 5 s + 10 ticks) the load averages are computed.
The input is the runnable + uninterruptible task count, and the formula is: old load * decay coefficient + new load * (1 - decay coefficient).
The 1-, 5- and 15-minute system load averages are maintained and can be read from the proc node.

PS: at first glance the two formulas look almost identical, but the loads they track are completely different -- mind the distinction!

References:

https://blog.csdn.net/pwl999/article/details/78817902

https://blog.csdn.net/wukongmingjing/article/details/82531950