Profiling and optimizing Python code performance with line_profiler

阿新 • Published: 2021-01-20

Why performance testing matters

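After finishing a Python project, we often need to optimize its performance. That calls for a clear approach: first identify where the bottlenecks are in the code and its functions. Ideally, a tool would measure the cost of every line in a target function, so that the slowest lines can be optimized first. The open-source library line_profiler does exactly this; its repository is at github.com/rkern/line_profiler. The sections below walk through installing and using it.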

Installing line_profiler

line_profiler can be installed from source or with pip. Only the pip route, which is the simpler one, is covered here; for a source install, see the official repository.

[dechin@dechin-manjaro line_profiler]$ python3 -m pip install line_profiler
Collecting line_profiler
  Downloading line_profiler-3.1.0-cp38-cp38-manylinux2010_x86_64.whl (65 kB)
     |████████████████████████████████| 65 kB 221 kB/s 
Requirement already satisfied: IPython in /home/dechin/anaconda3/lib/python3.8/site-packages (from line_profiler) (7.19.0)
Requirement already satisfied: prompt-toolkit!=3.0.0,!=3.0.1,<3.1.0,>=2.0.0 in /home/dechin/anaconda3/lib/python3.8/site-packages (from IPython->line_profiler) (3.0.8)
Requirement already satisfied: backcall in /home/dechin/anaconda3/lib/python3.8/site-packages (from IPython->line_profiler) (0.2.0)
Requirement already satisfied: pexpect>4.3; sys_platform != "win32" in /home/dechin/anaconda3/lib/python3.8/site-packages (from IPython->line_profiler) (4.8.0)
Requirement already satisfied: setuptools>=18.5 in /home/dechin/anaconda3/lib/python3.8/site-packages (from IPython->line_profiler) (50.3.1.post20201107)
Requirement already satisfied: jedi>=0.10 in /home/dechin/anaconda3/lib/python3.8/site-packages (from IPython->line_profiler) (0.17.1)
Requirement already satisfied: decorator in /home/dechin/anaconda3/lib/python3.8/site-packages (from IPython->line_profiler) (4.4.2)
Requirement already satisfied: traitlets>=4.2 in /home/dechin/anaconda3/lib/python3.8/site-packages (from IPython->line_profiler) (5.0.5)
Requirement already satisfied: pygments in /home/dechin/anaconda3/lib/python3.8/site-packages (from IPython->line_profiler) (2.7.2)
Requirement already satisfied: pickleshare in /home/dechin/anaconda3/lib/python3.8/site-packages (from IPython->line_profiler) (0.7.5)
Requirement already satisfied: wcwidth in /home/dechin/anaconda3/lib/python3.8/site-packages (from prompt-toolkit!=3.0.0,!=3.0.1,<3.1.0,>=2.0.0->IPython->line_profiler) (0.2.5)
Requirement already satisfied: ptyprocess>=0.5 in /home/dechin/anaconda3/lib/python3.8/site-packages (from pexpect>4.3; sys_platform != "win32"->IPython->line_profiler) (0.6.0)
Requirement already satisfied: parso<0.8.0,>=0.7.0 in /home/dechin/anaconda3/lib/python3.8/site-packages (from jedi>=0.10->IPython->line_profiler) (0.7.0)
Requirement already satisfied: ipython-genutils in /home/dechin/anaconda3/lib/python3.8/site-packages (from traitlets>=4.2->IPython->line_profiler) (0.2.0)
Installing collected packages: line-profiler
Successfully installed line-profiler-3.1.0

As an aside, pip can also be pointed at a different index for a single install. The following uses the PyPI mirror provided by Tencent:

python3 -m pip install -i https://mirrors.cloud.tencent.com/pypi/simple line_profiler

To make the index setting permanent, edit ~/.pip/pip.conf. A reference example follows (using the Huawei Cloud mirror):

[global]
index-url = https://mirrors.huaweicloud.com/repository/pypi/simple
trusted-host = mirrors.huaweicloud.com
timeout = 120

Referencing line_profiler in the code to be tuned

Let's go straight to an example:

# line_profiler_test.py
from line_profiler import LineProfiler
import numpy as np

@profile
def test_profiler():
    for i in range(100):
        a = np.random.randn(100)
        b = np.random.randn(1000)
        c = np.random.randn(10000)
    return None

if __name__ == '__main__':
    test_profiler()

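This example defines a function to test, test_profiler, whose body contains several calls to numpy.random.randn that we want to analyze. To set it up, place the decorator named profile above each function that needs line-by-line analysis. The script also imports the LineProfiler class, but when the script is run through kernprof -l (as in the next section), kernprof supplies the profile decorator itself, so the import is not actually used. For how Python decorators work, see the blog post linked from the original article. Note also that line_profiler only analyzes the body of the decorated function: it does not descend into other functions called from it, with the exception, according to the author, of nested functions defined inside it.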
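
Because the profile name only exists when the script is launched through kernprof -l, running the same file with plain python raises a NameError. The following is a minimal sketch of one common workaround, not part of the original article: it falls back to a do-nothing decorator when profile is missing. The fallback function and the rounds parameter are hypothetical names chosen for illustration, and the workload uses the standard library instead of numpy to keep the sketch self-contained.

```python
import builtins
import random

# `kernprof -l` inserts a `profile` decorator into builtins before it runs the
# script. Under plain `python` that name is missing, so this sketch defines a
# stand-in that leaves the function untouched.
if not hasattr(builtins, "profile"):
    def profile(func):
        # Hypothetical fallback: hand the function back unchanged.
        return func


@profile
def test_profiler(rounds=100):
    total = 0.0
    for _ in range(rounds):
        # Each round sums 1000 random floats in [0, 1).
        total += sum(random.random() for _ in range(1000))
    return total


if __name__ == "__main__":
    print(test_profiler())
```

With this guard in place the file can be run both ways: directly for normal use, and through kernprof when line-by-line timings are wanted.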

Simple performance analysis with line_profiler

Using line_profiler takes two steps: run the script under kernprof to record the profile, then display the results with python.

Once the function to analyze is decorated, run the script with kernprof to produce a binary .lprof file:

[dechin-manjaro line_profiler]# kernprof -l line_profiler_test.py 
Wrote profile results to line_profiler_test.py.lprof

When the command finishes, an .lprof file appears in the current directory:

[dechin-manjaro line_profiler]# ll
total 8
-rw-r--r-- 1 dechin dechin 304 Jan 20 16:00 line_profiler_test.py
-rw-r--r-- 1 root   root   185 Jan 20 16:00 line_profiler_test.py.lprof

Display the contents of the .lprof binary file with python3:

[dechin-manjaro line_profiler]# python3 -m line_profiler line_profiler_test.py.lprof 
Timer unit: 1e-06 s

Total time: 0.022633 s
File: line_profiler_test.py
Function: test_profiler at line 5

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
     5                                           @profile
     6                                           def test_profiler():
     7       101         40.0      0.4      0.2      for i in range(100):
     8       100        332.0      3.3      1.5          a = np.random.randn(100)
     9       100       2092.0     20.9      9.2          b = np.random.randn(1000)
    10       100      20169.0    201.7     89.1          c = np.random.randn(10000)
    11         1          0.0      0.0      0.0      return None

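This gives the line-by-line results directly. The columns are, in order: the line number in the source file, the number of times the line was executed, the total time spent on the line (in timer units, here 1e-06 s), the average time per execution, the line's share of the function's total time, and the source code itself. That is all there is to basic usage of line_profiler. The rest of this article works through a more practical example for readers who want to see it in action.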
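
To make the column definitions concrete, the sketch below recomputes Per Hit and % Time from the Hits and Time values in the report above. It only reuses the printed numbers; the names rows and derive_columns are invented for this illustration and are not part of line_profiler.

```python
# (line number, hits, time in timer units) copied from the report above.
TIMER_UNIT = 1e-06  # seconds per timer unit, from the "Timer unit" header
rows = [
    (7, 101, 40.0),
    (8, 100, 332.0),
    (9, 100, 2092.0),
    (10, 100, 20169.0),
    (11, 1, 0.0),
]

# The function's total time is the sum of the per-line times.
total_units = sum(time_units for _, _, time_units in rows)


def derive_columns(hits, time_units, total):
    """Return (per_hit, percent_time) as the report defines them."""
    per_hit = time_units / hits
    percent = 100.0 * time_units / total
    return per_hit, percent


print(f"Total time: {total_units * TIMER_UNIT:.6f} s")
for line_no, hits, time_units in rows:
    per_hit, percent = derive_columns(hits, time_units, total_units)
    print(f"{line_no:>4} {hits:>6} {time_units:>10.1f} {per_hit:>8.1f} {percent:>7.1f}")
```

Rounded to one decimal place, the derived values agree with the Per Hit and % Time columns printed by line_profiler.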

Comparing the efficiency of sin implementations from different libraries with line_profiler

Here we test the sine function as implemented in several libraries, including a wrapper we write ourselves around Fortran's intrinsic SIN function.

Before the profiling itself, let's see how to turn a Fortran .f90 file into a shared library that Python can call.

First, install gfortran on Manjaro Linux:

[dechin-manjaro line_profiler]# pacman -S gcc-fortran
resolving dependencies...
looking for conflicting packages...

Packages (1) gcc-fortran-10.2.0-4

Total Download Size:    9.44 MiB
Total Installed Size:  31.01 MiB

:: Proceed with installation? [Y/n] Y
:: Retrieving packages...
 gcc-fortran-10.2.0-4-x86_64             9.4 MiB  6.70 MiB/s 00:01 [########################################] 100%
(1/1) checking keys in keyring                                     [########################################] 100%
(1/1) checking package integrity                                   [########################################] 100%
(1/1) loading package files                                        [########################################] 100%
(1/1) checking for file conflicts                                  [########################################] 100%
(1/1) checking available disk space                                [########################################] 100%
:: Processing package changes...
(1/1) installing gcc-fortran                                       [########################################] 100%
:: Running post-transaction hooks...
(1/2) Arming ConditionNeedsUpdate...
(2/2) Updating the info directory file...

Create a simple Fortran file, fmath.f90, that returns the sine of its argument:

subroutine fsin(theta,result)
        implicit none
        real*8::theta
        real*8,intent(out)::result
        result=SIN(theta)
end subroutine

Use f2py to compile the Fortran file into a shared library named fmath:

[dechin-manjaro line_profiler]# f2py -c -m fmath fmath.f90 
running build
running config_cc
unifing config_cc, config, build_clib, build_ext, build commands --compiler options
running config_fc
unifing config_fc, config, build_clib, build_ext, build commands --fcompiler options
running build_src
build_src
building extension "fmath" sources
f2py options: []
f2py:> /tmp/tmpup5ia9lf/src.linux-x86_64-3.8/fmathmodule.c
creating /tmp/tmpup5ia9lf/src.linux-x86_64-3.8
Reading fortran codes...
        Reading file 'fmath.f90' (format:free)
Post-processing...
        Block: fmath
                        Block: fsin
Post-processing (stage 2)...
Building modules...
        Building module "fmath"...
                Constructing wrapper function "fsin"...
                  result = fsin(theta)
        Wrote C/API module "fmath" to file "/tmp/tmpup5ia9lf/src.linux-x86_64-3.8/fmathmodule.c"
  adding '/tmp/tmpup5ia9lf/src.linux-x86_64-3.8/fortranobject.c' to sources.
  adding '/tmp/tmpup5ia9lf/src.linux-x86_64-3.8' to include_dirs.
copying /home/dechin/anaconda3/lib/python3.8/site-packages/numpy/f2py/src/fortranobject.c -> /tmp/tmpup5ia9lf/src.linux-x86_64-3.8
copying /home/dechin/anaconda3/lib/python3.8/site-packages/numpy/f2py/src/fortranobject.h -> /tmp/tmpup5ia9lf/src.linux-x86_64-3.8
build_src: building npy-pkg config files
running build_ext
customize UnixCCompiler
customize UnixCCompiler using build_ext
get_default_fcompiler: matching types: '['gnu95', 'intel', 'lahey', 'pg', 'absoft', 'nag', 'vast', 'compaq', 'intele', 'intelem', 'gnu', 'g95', 'pathf95', 'nagfor']'
customize Gnu95FCompiler
Found executable /usr/bin/gfortran
customize Gnu95FCompiler
customize Gnu95FCompiler using build_ext
building 'fmath' extension
compiling C sources
C compiler: gcc -pthread -B /home/dechin/anaconda3/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC

creating /tmp/tmpup5ia9lf/tmp
creating /tmp/tmpup5ia9lf/tmp/tmpup5ia9lf
creating /tmp/tmpup5ia9lf/tmp/tmpup5ia9lf/src.linux-x86_64-3.8
compile options: '-I/tmp/tmpup5ia9lf/src.linux-x86_64-3.8 -I/home/dechin/anaconda3/lib/python3.8/site-packages/numpy/core/include -I/home/dechin/anaconda3/include/python3.8 -c'
gcc: /tmp/tmpup5ia9lf/src.linux-x86_64-3.8/fmathmodule.c
gcc: /tmp/tmpup5ia9lf/src.linux-x86_64-3.8/fortranobject.c
In file included from /home/dechin/anaconda3/lib/python3.8/site-packages/numpy/core/include/numpy/ndarraytypes.h:1822,
                 from /home/dechin/anaconda3/lib/python3.8/site-packages/numpy/core/include/numpy/ndarrayobject.h:12,
                 from /home/dechin/anaconda3/lib/python3.8/site-packages/numpy/core/include/numpy/arrayobject.h:4,
                 from /tmp/tmpup5ia9lf/src.linux-x86_64-3.8/fortranobject.h:13,
                 from /tmp/tmpup5ia9lf/src.linux-x86_64-3.8/fmathmodule.c:15:
/home/dechin/anaconda3/lib/python3.8/site-packages/numpy/core/include/numpy/npy_1_7_deprecated_api.h:17:2: warning: #warning "Using deprecated NumPy API, disable it with " "#define NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION" [-Wcpp]
   17 | #warning "Using deprecated NumPy API, disable it with " \
      |  ^~~~~~~
In file included from /home/dechin/anaconda3/lib/python3.8/site-packages/numpy/core/include/numpy/ndarraytypes.h:1822,
                 from /home/dechin/anaconda3/lib/python3.8/site-packages/numpy/core/include/numpy/ndarrayobject.h:12,
                 from /home/dechin/anaconda3/lib/python3.8/site-packages/numpy/core/include/numpy/arrayobject.h:4,
                 from /tmp/tmpup5ia9lf/src.linux-x86_64-3.8/fortranobject.h:13,
                 from /tmp/tmpup5ia9lf/src.linux-x86_64-3.8/fortranobject.c:2:
/home/dechin/anaconda3/lib/python3.8/site-packages/numpy/core/include/numpy/npy_1_7_deprecated_api.h:17:2: warning: #warning "Using deprecated NumPy API, disable it with " "#define NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION" [-Wcpp]
   17 | #warning "Using deprecated NumPy API, disable it with " \
      |  ^~~~~~~
compiling Fortran sources
Fortran f77 compiler: /usr/bin/gfortran -Wall -g -ffixed-form -fno-second-underscore -fPIC -O3 -funroll-loops
Fortran f90 compiler: /usr/bin/gfortran -Wall -g -fno-second-underscore -fPIC -O3 -funroll-loops
Fortran fix compiler: /usr/bin/gfortran -Wall -g -ffixed-form -fno-second-underscore -Wall -g -fno-second-underscore -fPIC -O3 -funroll-loops
compile options: '-I/tmp/tmpup5ia9lf/src.linux-x86_64-3.8 -I/home/dechin/anaconda3/lib/python3.8/site-packages/numpy/core/include -I/home/dechin/anaconda3/include/python3.8 -c'
gfortran:f90: fmath.f90
/usr/bin/gfortran -Wall -g -Wall -g -shared /tmp/tmpup5ia9lf/tmp/tmpup5ia9lf/src.linux-x86_64-3.8/fmathmodule.o /tmp/tmpup5ia9lf/tmp/tmpup5ia9lf/src.linux-x86_64-3.8/fortranobject.o /tmp/tmpup5ia9lf/fmath.o -L/usr/lib/gcc/x86_64-pc-linux-gnu/10.2.0/../../../../lib -L/usr/lib/gcc/x86_64-pc-linux-gnu/10.2.0/../../../../lib -lgfortran -o ./fmath.cpython-38-x86_64-linux-gnu.so
Removing build directory /tmp/tmpup5ia9lf

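The build emits some warnings, but they do not affect normal use. When it finishes, a .so file appears in the current directory (on Windows it may be a different kind of shared library file):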

[dechin-manjaro line_profiler]# ll
total 120
-rwxr-xr-x 1 root   root   107256 Jan 20 16:40 fmath.cpython-38-x86_64-linux-gnu.so
-rw-r--r-- 1 root   root      150 Jan 20 16:40 fmath.f90
-rw-r--r-- 1 dechin dechin    304 Jan 20 16:00 line_profiler_test.py
-rw-r--r-- 1 root   root      185 Jan 20 16:00 line_profiler_test.py.lprof

Use ipython to check that the shared library works:

[dechin-manjaro line_profiler]# ipython
Python 3.8.5 (default, Sep  4 2020, 07:30:14) 
Type 'copyright', 'credits' or 'license' for more information
IPython 7.19.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]: from fmath import fsin

In [2]: print (fsin(3.14))
0.0015926529164868282

In [3]: print (fsin(3.1415926))
5.3589793170057245e-08

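The Fortran-based sine function works as expected. Now let's compare the performance of several sine implementations (some may share the same underlying implementation; they are treated as black boxes here).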
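
Before benchmarking, the two spot checks done in ipython can be turned into a repeatable comparison against math.sin. The following is a rough sketch under stated assumptions: max_abs_error and cmath_sin_real are hypothetical helpers written for this illustration, and cmath stands in for the candidate so the code runs without the compiled fmath module.

```python
import cmath
import math


def max_abs_error(candidate, reference=math.sin, samples=1000):
    """Largest |candidate(x) - reference(x)| over evenly spaced x in [0, 2*pi]."""
    worst = 0.0
    for k in range(samples + 1):
        x = 2.0 * math.pi * k / samples
        worst = max(worst, abs(candidate(x) - reference(x)))
    return worst


def cmath_sin_real(x):
    # cmath.sin returns a complex number; keep only the real part.
    return cmath.sin(x).real


if __name__ == "__main__":
    print("cmath vs math:", max_abs_error(cmath_sin_real))
    # With the compiled module from this article on the import path, the same
    # check would be:
    #     from fmath import fsin
    #     print("fortran vs math:", max_abs_error(fsin))
```

A check like this guards against comparing the speed of functions that do not actually agree on their results.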

First, create the Python file to test, sin_profiler_test.py:

# sin_profiler_test.py
from line_profiler import LineProfiler
import random
from numpy import sin as numpy_sin
from math import sin as math_sin
# from cupy import sin as cupy_sin
from cmath import sin as cmath_sin
from fmath import fsin as fortran_sin

@profile
def test_profiler():
    for i in range(100000):
        r = random.random()
        a = numpy_sin(r)
        b = math_sin(r)
        # c = cupy_sin(r)
        d = cmath_sin(r)
        e = fortran_sin(r)
    return None

if __name__ == '__main__':
    test_profiler()

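The line_profiler setup is the same as in the previous example. The functions under test are the sine implementations from three libraries, numpy, math, and cmath, plus our own Fortran sine function, which plugs into Python through the shared library built with f2py above. cupy was meant to be a fourth library, but it could not be installed here, so its lines are commented out. As before, run the script under kernprof: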

[dechin-manjaro line_profiler]# kernprof -l sin_profiler_test.py 
Wrote profile results to sin_profiler_test.py.lprof

Then display the results with python3:

[dechin-manjaro line_profiler]# python3 -m line_profiler sin_profiler_test.py.lprof 
Timer unit: 1e-06 s

Total time: 0.261304 s
File: sin_profiler_test.py
Function: test_profiler at line 10

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
    10                                           @profile
    11                                           def test_profiler():
    12    100001      28032.0      0.3     10.7      for i in range(100000):
    13    100000      33995.0      0.3     13.0          r = random.random()
    14    100000      86870.0      0.9     33.2          a = numpy_sin(r)
    15    100000      33374.0      0.3     12.8          b = math_sin(r)
    16                                                   # c = cupy_sin(r)
    17    100000      40179.0      0.4     15.4          d = cmath_sin(r)
    18    100000      38854.0      0.4     14.9          e = fortran_sin(r)
    19         1          0.0      0.0      0.0      return None
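
As a rough reading aid, the Time column of this report can be turned into relative costs. The sketch below only reuses the numbers printed above; the dictionary and variable names are made up for illustration, and the figures include the profiler's own per-line overhead, so the ratios are indicative rather than exact.

```python
# Time column (timer units of 1e-06 s, 100000 calls each) copied from the report.
time_units = {
    "numpy": 86870.0,
    "math": 33374.0,
    "cmath": 40179.0,
    "fortran": 38854.0,
}

# Use math.sin, the fastest entry in this report, as the baseline.
baseline = time_units["math"]

# Sort from fastest to slowest and express each entry as a multiple of math.sin.
ranking = sorted(time_units.items(), key=lambda item: item[1])
for name, units in ranking:
    print(f"{name:<8} {units / baseline:.2f}x")
```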

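The results show that, of the four implementations tested, math is the fastest and numpy the slowest. Our own Fortran wrapper takes less than half the time of the numpy call and is second only to math. Since only single function calls are being measured here, we can also use the %timeit magic built into ipython: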

[dechin-manjaro line_profiler]# ipython
Python 3.8.5 (default, Sep  4 2020, 07:30:14) 
Type 'copyright', 'credits' or 'license' for more information
IPython 7.19.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]: from fmath import fsin

In [2]: import random

In [3]: %timeit fsin(random.random())
145 ns ± 2.38 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

In [4]: from math import sin as math_sin

In [5]: %timeit math_sin(random.random())
107 ns ± 0.116 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

In [6]: from numpy import sin as numpy_sin

In [7]: %timeit numpy_sin(random.random())
611 ns ± 4.28 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

In [8]: from cmath import sin as cmath_sin

In [9]: %timeit cmath_sin(random.random())
151 ns ± 1.01 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

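The ranking matches the earlier one. The absolute numbers differ somewhat, however, because each measurement here times the random call together with the sine call.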
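
One way to see how much of each figure comes from random.random() is to time that call on its own and subtract it. The following is a minimal sketch using the standard-library timeit module outside ipython; best_ns_per_call, measure, and the NUMBER constant are hypothetical names for this illustration, and only math and cmath are included so the code runs without numpy or the compiled fmath module.

```python
import timeit

NUMBER = 200_000  # calls per measurement; an arbitrary choice for this sketch


def best_ns_per_call(stmt, setup, repeat=5, number=NUMBER):
    """Best-of-`repeat` wall time per execution of `stmt`, in nanoseconds."""
    timer = timeit.Timer(stmt, setup=setup)
    return min(timer.repeat(repeat=repeat, number=number)) / number * 1e9


def measure():
    # Cost of drawing the random input alone.
    overhead = best_ns_per_call("random.random()", "import random")
    results = {}
    for module in ("math", "cmath"):
        setup = f"import random; from {module} import sin"
        combined = best_ns_per_call("sin(random.random())", setup)
        # Store the combined time and the time with the input cost subtracted.
        results[module] = (combined, combined - overhead)
    return overhead, results


if __name__ == "__main__":
    overhead, results = measure()
    print(f"random.random() alone: {overhead:.0f} ns")
    for name, (combined, net) in results.items():
        print(f"{name:<6} combined {combined:.0f} ns, minus input {net:.0f} ns")
```

The subtraction is only an estimate, since timing noise affects both measurements, but it gives a sense of how much the shared input generation contributes to every row.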

Summary

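This article introduced line_profiler, a line-by-line profiling tool for Python. Adding a simple decorator is enough to locate a program's bottlenecks so they can be optimized in a targeted way. The tests also show that different implementations of the sine function differ in performance, although the difference goes unnoticed when the function is called only occasionally. Even a sine function can be implemented in many ways, for example with various series expansions, and the author notes that table lookup remains one of the most popular and fastest approaches. Different algorithms and different languages can therefore give quite different results. In these tests, the ranking is math < fortran < cmath < numpy, with execution time increasing from left to right.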

Copyright notice

This article was first published at: https://www.cnblogs.com/dechinphy/p/line-profiler.html
Author ID: DechinPhy
More original articles: https://www.cnblogs.com/dechinphy/
