[Go] My Performance, My Call (1)

阿新 • Published: 2019-01-23

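For some services, performance is critical: it determines system throughput and request latency, and in turn affects the user experience.

Writing performance tests in Go is convenient, because the standard toolchain that ships with Go has solid support for them. Below we take a detailed look at benchmarking, from the perspective of Go internals and system calls.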

benchmark

To write a benchmark in Go, create a file whose name ends in _test.go in the package directory and add a function like the following:

func BenchmarkStringJoin1(b *testing.B) {
    b.ReportAllocs()
    input := []string{"Hello", "World"}
    for i := 0; i < b.N; i++ {
        result := strings.Join(input, " ")
        if result != "Hello World" {
            b.Error("Unexpected result: " + result)
        }
    }
}

Run the following command:
# go test -run=xxx -bench=. -benchtime="3s" -cpuprofile profile_cpu.out

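This command skips the unit tests (the -run pattern xxx is not expected to match any test name), runs every benchmark, and writes a CPU profile to profile_cpu.out.

Two things to note:
1. -benchtime controls how long each benchmark runs.
2. b.ReportAllocs() adds memory allocation statistics to the report. For example: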
BenchmarkStringJoin1-4 300000 4351 ns/op 32 B/op 2 allocs/op

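The -4 suffix is the GOMAXPROCS value the benchmark ran with (4 here); 300000 is the number of iterations executed; 4351 ns/op means each iteration took 4351 nanoseconds on average; 32 B/op means each iteration allocated 32 bytes; and 2 allocs/op means each iteration performed 2 heap allocations.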
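The same fields can also be read programmatically. The following is a minimal sketch, not part of the original article: it assumes a standalone program rather than a _test.go file, and measureJoin and sink are hypothetical names introduced for illustration. It uses testing.Benchmark from the standard library, which returns a testing.BenchmarkResult holding the iteration count and per-operation figures.

```go
package main

import (
	"fmt"
	"strings"
	"testing"
)

// sink keeps the result alive so the compiler cannot discard the call.
var sink string

// measureJoin (hypothetical helper) runs the strings.Join loop under
// testing.Benchmark and returns the collected result.
func measureJoin() testing.BenchmarkResult {
	// testing.Init registers the testing flags; the package docs recommend
	// calling it when Benchmark is used outside "go test".
	testing.Init()
	return testing.Benchmark(func(b *testing.B) {
		b.ReportAllocs()
		input := []string{"Hello", "World"}
		for i := 0; i < b.N; i++ {
			sink = strings.Join(input, " ")
		}
	})
}

func main() {
	r := measureJoin()
	fmt.Println("iterations:", r.N)                   // second column of the report
	fmt.Println("ns/op:     ", r.NsPerOp())           // average time per iteration
	fmt.Println("B/op:      ", r.AllocedBytesPerOp()) // bytes allocated per iteration
	fmt.Println("allocs/op: ", r.AllocsPerOp())       // heap allocations per iteration
}
```

The absolute numbers depend on the machine and the Go version, so they will not match the figures quoted in this article.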

With this information we can optimize memory allocation on hot paths.

For example, BenchmarkStringJoin1 above can be optimized slightly:

func BenchmarkStringJoin2(b *testing.B) {
    b.ReportAllocs()
    input := []string{"Hello", "World"}
    join := func(strs []string, delim string) string {
        if len(strs) == 2 {
            return strs[0] + delim + strs[1]
        }
        return ""
    }
    for i := 0; i < b.N; i++ {
        result := join(input, " ")
        if result != "Hello World" {
            b.Error("Unexpected result: " + result)
        }
    }
}

The new benchmark result is:
BenchmarkStringJoin2-4 500000 2440 ns/op 16 B/op 1 allocs/op
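As the numbers show, after reducing memory allocation the time per operation fell from 4351 ns to 2440 ns, which is more than 60% higher throughput!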
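To check the allocation count of each approach in isolation, one option is testing.AllocsPerRun from the standard library. The sketch below is an illustration rather than the author's method: allocsJoin, allocsConcat, and sink are hypothetical names, and the results are assigned to a package-level variable so that they escape to the heap in both cases.

```go
package main

import (
	"fmt"
	"strings"
	"testing"
)

// sink forces each result to escape, so the allocation is not optimized away.
var sink string

// allocsJoin reports the average number of heap allocations per call
// when the two strings are joined with strings.Join.
func allocsJoin() float64 {
	input := []string{"Hello", "World"}
	return testing.AllocsPerRun(1000, func() {
		sink = strings.Join(input, " ")
	})
}

// allocsConcat reports the same figure for plain concatenation with +.
func allocsConcat() float64 {
	input := []string{"Hello", "World"}
	return testing.AllocsPerRun(1000, func() {
		sink = input[0] + " " + input[1]
	})
}

func main() {
	fmt.Println("strings.Join allocs per call: ", allocsJoin())
	fmt.Println("concatenation allocs per call:", allocsConcat())
}
```

The implementation of strings.Join has changed across Go releases, so the gap between the two approaches may be smaller on a current toolchain than the figures reported in this article.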

cpu profile

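The benchmark results in the previous section only show a function's overall performance. What if the function is complex and we want to know where the time goes inside it? This is where the CPU profile comes in.

The CPU profile is one of the most impressive parts of the Go toolchain. Once you are comfortable with it, along with the memory and block profiles, there are few performance bottlenecks you will not be able to find.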
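For context, a CPU profile can also be collected directly from code with the standard runtime/pprof package. The following is a minimal sketch under stated assumptions: profileJoin and sink are hypothetical names, and the profile is written to an in-memory buffer to keep the example self-contained, whereas a real program would more likely write to a file created with os.Create.

```go
package main

import (
	"bytes"
	"fmt"
	"runtime/pprof"
	"strings"
)

// sink keeps the result alive so the loop body is not optimized away.
var sink string

// profileJoin (hypothetical helper) profiles a strings.Join loop and
// returns the size in bytes of the collected profile data.
func profileJoin(iterations int) (int, error) {
	var buf bytes.Buffer
	// StartCPUProfile begins sampling and streams the profile to buf.
	if err := pprof.StartCPUProfile(&buf); err != nil {
		return 0, err
	}
	input := []string{"Hello", "World"}
	for i := 0; i < iterations; i++ {
		sink = strings.Join(input, " ")
	}
	// StopCPUProfile flushes all pending profile data before returning.
	pprof.StopCPUProfile()
	return buf.Len(), nil
}

func main() {
	n, err := profileJoin(1000000)
	if err != nil {
		fmt.Println("could not start CPU profile:", err)
		return
	}
	fmt.Printf("collected %d bytes of CPU profile data\n", n)
}
```

Data written this way is in the same format that go tool pprof reads, so saving it to a file allows the analysis steps in this section to be applied to it as well.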

The earlier benchmark run also produced a profile_cpu.out file. Now run the following command:

# go tool pprof app.test profile_cpu.out
Entering interactive mode (type "help" for commands)
(pprof) top10
8220ms of 10360ms total (79.34%)
Dropped 63 nodes (cum <= 51.80ms)
Showing top 10 nodes out of 54 (cum >= 160ms)
      flat  flat%   sum%        cum   cum%
    2410ms 23.26% 23.26%     4960ms 47.88%  runtime.concatstrings
    2180ms 21.04% 44.31%     2680ms 25.87%  runtime.mallocgc
    1200ms 11.58% 55.89%     1200ms 11.58%  runtime.memmove
     530ms  5.12% 61.00%      530ms  5.12%  runtime.memeqbody
     530ms  5.12% 66.12%     2540ms 24.52%  runtime.rawstringtmp
     470ms  4.54% 70.66%     2420ms 23.36%  strings.Join
     390ms  3.76% 74.42%     2330ms 22.49%  app.BenchmarkStringJoin3B
     180ms  1.74% 76.16%     1970ms 19.02%  runtime.rawstring
     170ms  1.64% 77.80%     5130ms 49.52%  runtime.concatstring3
     160ms  1.54% 79.34%      160ms  1.54%  runtime.eqstring

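The output above only lists some of the functions and does not analyze performance along call chains. For the full picture we need to generate an SVG or PDF graph.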

# go tool pprof -svg profile_cpu.out > profile_cpu.svg
# go tool pprof -pdf profile_cpu.out > profile_cpu.pdf

The graph rendered to profile_cpu.pdf looks like this:
[Image: call graph from profile_cpu.pdf]

The graph contains several benchmarks at once (the benchmark functions shown earlier all live in the same file), but we only care about the slowest one, so we need to filter:

go test -run=xxx -bench=BenchmarkStringJoin2B$ -cpuprofile profile_2b.out
go test -run=xxx -bench=BenchmarkStringJoin2$ -cpuprofile profile_2.out
go tool pprof -svg profile_2b.out > profile_2b.svg
go tool pprof -svg profile_2.out > profile_2.svg

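[Image: call graphs from the filtered profiles]

According to the graph, runtime.concatstrings, a runtime function outside the benchmark's own code, triggers the memory allocation and accounts for much of the time. The call graph cannot take us any further than this, so next we need flame graphs.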

“A flame graph is a good way to drill down your benchmarks, finding your bottlenecks #golang” via @TitPetric

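[Image: flame graph of the CPU profile]

To look at any part in detail, just click the corresponding rectangle. The script below installs the FlameGraph scripts and go-torch, then renders profile_cpu.out as a flame graph: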

#!/bin/bash
# install flamegraph scripts
if [ ! -d "/opt/flamegraph" ]; then
    echo "Installing flamegraph (git clone)"
    git clone --depth=1 https://github.com/brendangregg/FlameGraph.git /opt/flamegraph
fi

# install go-torch using docker
if [ ! -f "bin/go-torch" ]; then
    echo "Installing go-torch via docker"
    docker run --net=party --rm=true -it -v "$(pwd)/bin:/go/bin" golang go get github.com/uber/go-torch
    # or if you have go installed locally: go get github.com/uber/go-torch
fi

PATH="$PATH:/opt/flamegraph"
bin/go-torch -b profile_cpu.out -f profile_cpu.torch.svg

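That concludes our benchmark tour. The CPU profile described above is not limited to benchmarks; it can also be used to debug the performance of applications running in production. We will not go into that here, since a later article in this series covers it. The complete benchmark code for this article follows: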

package main

import (
    "strings"
    "testing"
)

func BenchmarkStringJoin1(b *testing.B) {
    b.ReportAllocs()
    input := []string{"Hello", "World"}
    for i := 0; i < b.N; i++ {
        result := strings.Join(input, " ")
        if result != "Hello World" {
            b.Error("Unexpected result: " + result)
        }
    }
}

func BenchmarkStringJoin1B(b *testing.B) {
    b.ReportAllocs()
    for i := 0; i < b.N; i++ {
        input := []string{"Hello", "World"}
        result := strings.Join(input, " ")
        if result != "Hello World" {
            b.Error("Unexpected result: " + result)
        }
    }
}

func BenchmarkStringJoin2(b *testing.B) {
    b.ReportAllocs()
    input := []string{"Hello", "World"}
    join := func(strs []string, delim string) string {
        if len(strs) == 2 {
            return strs[0] + delim + strs[1]
        }
        return ""
    }
    for i := 0; i < b.N; i++ {
        result := join(input, " ")
        if result != "Hello World" {
            b.Error("Unexpected result: " + result)
        }
    }
}

func BenchmarkStringJoin2B(b *testing.B) {
    b.ReportAllocs()
    join := func(strs []string, delim string) string {
        if len(strs) == 2 {
            return strs[0] + delim + strs[1]
        }
        return ""
    }
    for i := 0; i < b.N; i++ {
        input := []string{"Hello", "World"}
        result := join(input, " ")
        if result != "Hello World" {
            b.Error("Unexpected result: " + result)
        }
    }
}

func BenchmarkStringJoin3(b *testing.B) {
    b.ReportAllocs()
    input := []string{"Hello", "World"}
    for i := 0; i < b.N; i++ {
        result := input[0] + " " + input[1]
        if result != "Hello World" {
            b.Error("Unexpected result: " + result)
        }
    }
}

func BenchmarkStringJoin3B(b *testing.B) {
    b.ReportAllocs()
    for i := 0; i < b.N; i++ {
        input := []string{"Hello", "World"}
        result := input[0] + " " + input[1]
        if result != "Hello World" {
            b.Error("Unexpected result: " + result)
        }
    }
}
