[Extended reading] The timeit module in detail (accurately measuring the execution time of small code snippets)

Reposted, excerpted, and adapted from 小甲魚's 零基礎入門學習Python (Learning Python from Scratch). Link: https://fishc.com.cn/thread-55593-1-1.html

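This article is mainly excerpts that record my own learning process. If it infringes any rights, please contact me and I will delete it promptly.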

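The timeit module provides a way to measure the execution time of small pieces of Python code. It can be used directly from the command line or called after importing the module, and it sidesteps a number of common traps in measuring execution time.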

The following examples use the command-line interface:

$ python -m timeit '"-".join(str(n) for n in range(100))'
10000 loops, best of 3: 40.3 usec per loop
$ python -m timeit '"-".join([str(n) for n in range(100)])'
10000 loops, best of 3: 33.4 usec per loop
$ python -m timeit '"-".join(map(str, range(100)))'
10000 loops, best of 3: 25.2 usec per loop

The following examples call the module from IDLE:

>>> import timeit
>>> timeit.timeit('"-".join(str(n) for n in range(100))', number=10000)
0.8187260627746582
>>> timeit.timeit('"-".join([str(n) for n in range(100)])', number=10000)
0.7288308143615723
>>> timeit.timeit('"-".join(map(str, range(100)))', number=10000)  # map applies the function in its first argument to each item of its second argument
0.5858950614929199

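Note that timeit automatically determines the number of repetitions only when the command-line interface is used. When calling the module from code, pass number yourself or accept the default of one million.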
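
As a rough illustration of that automatic choice, the sketch below mimics what the command-line interface does in the source listed at the end of this article: it tries successive powers of 10 until one run takes long enough. The helper name pick_number and its min_total parameter are made up for this example (the command line uses a fixed 0.2 seconds).

```python
import timeit

def pick_number(timer, min_total=0.2):
    """Hypothetical helper: return (number, seconds) for the first power
    of 10 whose total run time reaches min_total seconds."""
    for exponent in range(1, 10):
        number = 10 ** exponent
        elapsed = timer.timeit(number)
        if elapsed >= min_total:
            break
    return number, elapsed

t = timeit.Timer('"-".join(map(str, range(100)))')
# A lower threshold than the command line's 0.2 s keeps the demo quick.
number, elapsed = pick_number(t, min_total=0.05)
print(number, "loops took", elapsed, "seconds")
```

The chosen number can then be passed to timeit() or repeat() for the real measurement.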

The timeit module

The module defines three convenience functions and one public class.

timeit.timeit(stmt='pass', setup='pass', timer=<default timer>, number=1000000)

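Creates a Timer instance from stmt (the statement or function to measure), setup (initialization code, or import statements that build the environment), and timer (the timing function), then runs its timeit() method with number executions of the statement.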

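Note: because timeit() executes stmt inside a generated function, a return statement in stmt prevents timeit() from returning the execution time. timeit() returns the value given by that return statement instead.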

timeit.repeat(stmt='pass', setup='pass', timer=<default timer>, repeat=3, number=1000000)

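Creates a Timer instance from stmt (the statement or function to measure), setup (initialization code, or import statements that build the environment), and timer (the timing function), then runs its repeat() method with repeat measurements of number executions each.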

timeit.default_timer()

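The default timer, normally time.perf_counter(). time.perf_counter() provides the highest-resolution timer available on any platform. It still measures wall-clock time, though, so the result is affected by many other factors, such as the load on the machine.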
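
For a quick one-off measurement, default_timer() can also be called directly, the same way the module's own timing loop does in the source at the end of this article. A minimal sketch; the workload (summing a range) is only a placeholder:

```python
import timeit

# Read the clock before and after the code of interest.
start = timeit.default_timer()
total = sum(range(100000))  # placeholder workload
elapsed = timeit.default_timer() - start

print("sum =", total, "took", elapsed, "seconds")
```

A single reading like this includes whatever else the machine was doing at the time, which is why timeit runs the statement many times.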

class timeit.Timer(stmt='pass', setup='pass', timer=<timer function>)

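A class for timing the execution speed of small code snippets. The constructor takes stmt (the statement or function to measure), setup (initialization code, or import statements that build the environment), and timer (the timing function). The first two default to 'pass'; the timer argument is platform-dependent. Both stmt and setup may contain multiple statements separated by semicolons (;) or newlines, as long as they do not contain multi-line string literals.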

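To measure the execution time of the first statement (stmt), use the timeit() method. The repeat() method is a convenience that calls timeit() several times and returns the results as a list.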

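stmt and setup may also be callable objects that take no arguments. Calls to them are then embedded in a timer function, which timeit() executes. Note that the timing overhead is slightly higher in this case because of the extra function calls.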
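
A small sketch of the callable form; build_list is a placeholder function invented for this example:

```python
import timeit

def build_list():
    """Placeholder workload: build a 100-element list."""
    return [i for i in range(100)]

# Pass the function object itself, not a string; it must take no arguments.
elapsed = timeit.timeit(build_list, number=10000)
print("10000 calls took", elapsed, "seconds")
```

The value returned by build_list is discarded; only the elapsed time comes back.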

The class provides the following methods:

- timeit(number=1000000)

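Purpose: time number executions of the statement.

It executes the setup statement once, then measures the time needed to execute the stmt statement number times, and returns that time in seconds as a float. number defaults to one million. The stmt, setup, and timer arguments are passed to the constructor of timeit.Timer.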
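
The "setup once, stmt number times" behaviour can be checked with two counting callables. This is only an illustrative sketch; the names calls, count_setup, and count_stmt are made up for the example:

```python
import timeit

calls = {"setup": 0, "stmt": 0}

def count_setup():
    calls["setup"] += 1  # should run once per timeit() call

def count_stmt():
    calls["stmt"] += 1   # should run `number` times

timeit.Timer(count_stmt, count_setup).timeit(number=1000)
print(calls)
```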

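Note: by default, timeit() temporarily turns off Python's garbage collection during timing. The advantage is that independent timings are more comparable. The disadvantage is that GC (garbage collection) is sometimes an important part of the performance of the function being measured. If so, GC can be re-enabled as the first statement in the setup argument, for example: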

timeit.Timer('for i in range(10): oct(i)', 'gc.enable()').timeit()

- repeat(repeat=3, number=1000000)

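Purpose: call timeit() repeatedly.

repeat() is a convenience that calls timeit() several times and returns the results as a list. The repeat argument specifies how many times to call timeit(); the number argument is passed through to timeit() as its number argument.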

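Note: it is tempting to calculate the mean and standard deviation of the results, but this is not very useful. In a typical case, the lowest value gives a lower bound for how fast your machine can run the given code snippet; the higher values are usually caused not by variability in Python's speed but by other processes interfering with your timing accuracy. So the lowest value (which min() returns) is probably the only number you should be interested in.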
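
A short sketch of that advice: collect several measurements with repeat() and keep only the smallest. The statement and the repeat/number values are arbitrary choices for the example:

```python
import timeit

number = 10000
t = timeit.Timer('"-".join(map(str, range(100)))')

# Five independent measurements, each running the statement `number` times.
results = t.repeat(repeat=5, number=number)

best = min(results)                   # least disturbed measurement
usec_per_loop = best * 1e6 / number   # same unit the command line reports
print("best of", len(results), "->", usec_per_loop, "usec per loop")
```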

- print_exc(file=None)

Purpose: print a traceback from the timed code.

Typical use:

t = Timer(...)       # outside the try/except
try:
    t.timeit(...)    # or t.repeat(...)
except Exception:
    t.print_exc()

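The advantage over the standard traceback is that source lines in the compiled template are displayed. The optional file argument specifies where the traceback is sent; it defaults to sys.stderr.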
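
A concrete version of the typical use, as a sketch: the timed statement fails on purpose, and the traceback is captured in an io.StringIO buffer through the file argument instead of going to sys.stderr. The failing statement and the buffer are choices made for this example only.

```python
import io
import timeit

t = timeit.Timer("1 / 0")  # outside the try/except; fails on purpose
buffer = io.StringIO()
try:
    t.timeit(number=1)
except Exception:
    t.print_exc(file=buffer)  # send the traceback to the buffer

report = buffer.getvalue()
print(report)
```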

Command-line interface

When called as a program from the command line, the following form is used:

python -m timeit [-n N] [-r N] [-s S] [-t] [-c] [-p] [-v] [-h] [statement ...]

The options mean the following:

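Option   Long form     Meaning
-n N     --number=N    how many times to execute the statement
-r N     --repeat=N    how many times to repeat the measurement (default 3)
-s S     --setup=S     initialization code, or import statements that build the environment (default pass)
-p       --process     measure process time rather than wall-clock time, using time.process_time() instead of the default time.perf_counter() (new in Python 3.3)
-t       --time        use time.time() (deprecated)
-c       --clock       use time.clock() (deprecated)
-v       --verbose     print raw timing results; repeat for more digits of precision
-h       --help        print a short usage message and exit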
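
The same options can be tried without leaving Python by handing an argument list to timeit.main(), the function that the source at the end of this article runs for the command line. This is a sketch rather than a recommended API: main() is not listed in the module's __all__, and the option values below are arbitrary.

```python
import timeit

# Equivalent to:
#   python -m timeit -n 1000 -r 3 -s 'text = "I love FishC.com!"; char = "o"' 'char in text'
exit_code = timeit.main([
    "-n", "1000",   # run the statement 1000 times per measurement
    "-r", "3",      # repeat the measurement 3 times
    "-s", 'text = "I love FishC.com!"; char = "o"',  # setup code
    "char in text", # the statement to time
])
print("exit code:", exit_code)
```

main() prints the usual "loops, best of" line itself; according to its docstring, a return value of None indicates success.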

Examples

The following shows how to provide a setup statement that runs once at the start:

Command line:

$ python -m timeit -s 'text = "I love FishC.com!"; char = "o"'  'char in text'
10000000 loops, best of 3: 0.0877 usec per loop
$ python -m timeit -s 'text = "I love FishC.com!"; char = "o"'  'text.find(char)'
1000000 loops, best of 3: 0.342 usec per loop

Using the timeit module:

>>> import timeit
>>> timeit.timeit('char in text', setup='text = "I love FishC.com!"; char = "o"')
0.41440500499993504
>>> timeit.timeit('text.find(char)', setup='text = "I love FishC.com!"; char = "o"')
1.7246671520006203

Using a Timer object:

>>> import timeit
>>> t = timeit.Timer('char in text', setup='text = "I love FishC.com!"; char = "o"')
>>> t.timeit()
0.3955516149999312
>>> t.repeat()
[0.40193588800002544, 0.3960157959998014, 0.39594301399984033]

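The following shows how to time a statement that spans multiple lines:

(Here we test whether an attribute exists in two ways, with hasattr() and with try/except, and compare their cost.)

Command line: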

$ python -m timeit 'try:' '  str.__bool__' 'except AttributeError:' '  pass'
100000 loops, best of 3: 15.7 usec per loop
$ python -m timeit 'if hasattr(str, "__bool__"): pass'
100000 loops, best of 3: 4.26 usec per loop

$ python -m timeit 'try:' '  int.__bool__' 'except AttributeError:' '  pass'
1000000 loops, best of 3: 1.43 usec per loop
$ python -m timeit 'if hasattr(int, "__bool__"): pass'
100000 loops, best of 3: 2.23 usec per loop

Using the timeit module:

>>> import timeit
>>> # attribute is missing
>>> s = """\
... try:
...     str.__bool__
... except AttributeError:
...     pass
... """
>>> timeit.timeit(stmt=s, number=100000)
0.9138244460009446
>>> s = "if hasattr(str, '__bool__'): pass"
>>> timeit.timeit(stmt=s, number=100000)
0.5829014980008651
>>>
>>> # attribute is present
>>> s = """\
... try:
...     int.__bool__
... except AttributeError:
...     pass
... """
>>> timeit.timeit(stmt=s, number=100000)
0.04215312199994514
>>> s = "if hasattr(int, '__bool__'): pass"
>>> timeit.timeit(stmt=s, number=100000)
0.08588060699912603

To give the timeit module access to functions you define, import them with an import statement in the setup argument:

def test():
    """Stupid test function"""
    L = [i for i in range(100)]

if __name__ == '__main__':
    import timeit
    print(timeit.timeit("test()", setup="from __main__ import test"))
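
On Python 3.5 and later, which is newer than the timeit source reproduced in this article, the same thing can be done without the import string by passing a namespace through the globals parameter. A minimal sketch; the explicit dictionary is one option, and globals=globals() is the more common shorthand at module level:

```python
import timeit

def test():
    """Same placeholder function as in the example above."""
    L = [i for i in range(100)]

# The timed statement is executed inside the namespace given by `globals`,
# so it can see `test` without a setup import.
elapsed = timeit.timeit("test()", globals={"test": test}, number=10000)
print(elapsed)
```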

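The source code of the timeit module is attached below. 小甲魚 strongly recommends studying it when you have time (it will do your programming skills a lot of good):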

#! /usr/bin/env python3

"""Tool for measuring execution time of small code snippets.

This module avoids a number of common traps for measuring execution
times.  See also Tim Peters' introduction to the Algorithms chapter in
the Python Cookbook, published by O'Reilly.

Library usage: see the Timer class.

Command line usage:
    python timeit.py [-n N] [-r N] [-s S] [-t] [-c] [-p] [-h] [--] [statement]

Options:
  -n/--number N: how many times to execute 'statement' (default: see below)
  -r/--repeat N: how many times to repeat the timer (default 3)
  -s/--setup S: statement to be executed once initially (default 'pass')
  -p/--process: use time.process_time() (default is time.perf_counter())
  -t/--time: use time.time() (deprecated)
  -c/--clock: use time.clock() (deprecated)
  -v/--verbose: print raw timing results; repeat for more digits precision
  -h/--help: print this usage message and exit
  --: separate options from statement, use when statement starts with -
  statement: statement to be timed (default 'pass')

A multi-line statement may be given by specifying each line as a
separate argument; indented lines are possible by enclosing an
argument in quotes and using leading spaces.  Multiple -s options are
treated similarly.

If -n is not given, a suitable number of loops is calculated by trying
successive powers of 10 until the total time is at least 0.2 seconds.

Note: there is a certain baseline overhead associated with executing a
pass statement.  It differs between versions.  The code here doesn't try
to hide it, but you should be aware of it.  The baseline overhead can be
measured by invoking the program without arguments.

Classes:

    Timer

Functions:

    timeit(string, string) -> float
    repeat(string, string) -> list
    default_timer() -> float

"""

import gc
import sys
import time
import itertools

__all__ = ["Timer", "timeit", "repeat", "default_timer"]

dummy_src_name = "<timeit-src>"
default_number = 1000000
default_repeat = 3
default_timer = time.perf_counter

# Don't change the indentation of the template; the reindent() calls
# in Timer.__init__() depend on setup being indented 4 spaces and stmt
# being indented 8 spaces.
template = """
def inner(_it, _timer):
    {setup}
    _t0 = _timer()
    for _i in _it:
        {stmt}
    _t1 = _timer()
    return _t1 - _t0
"""

def reindent(src, indent):
    """Helper to reindent a multi-line statement."""
    return src.replace("\n", "\n" + " "*indent)

def _template_func(setup, func):
    """Create a timer function. Used if the "statement" is a callable."""
    def inner(_it, _timer, _func=func):
        setup()
        _t0 = _timer()
        for _i in _it:
            _func()
        _t1 = _timer()
        return _t1 - _t0
    return inner

class Timer:
    """Class for timing execution speed of small code snippets.

    The constructor takes a statement to be timed, an additional
    statement used for setup, and a timer function.  Both statements
    default to 'pass'; the timer function is platform-dependent (see
    module doc string).

    To measure the execution time of the first statement, use the
    timeit() method.  The repeat() method is a convenience to call
    timeit() multiple times and return a list of results.

    The statements may contain newlines, as long as they don't contain
    multi-line string literals.
    """

    def __init__(self, stmt="pass", setup="pass", timer=default_timer):
        """Constructor.  See class doc string."""
        self.timer = timer
        ns = {}
        if isinstance(stmt, str):
            stmt = reindent(stmt, 8)
            if isinstance(setup, str):
                setup = reindent(setup, 4)
                src = template.format(stmt=stmt, setup=setup)
            elif callable(setup):
                src = template.format(stmt=stmt, setup='_setup()')
                ns['_setup'] = setup
            else:
                raise ValueError("setup is neither a string nor callable")
            self.src = src # Save for traceback display
            code = compile(src, dummy_src_name, "exec")
            exec(code, globals(), ns)
            self.inner = ns["inner"]
        elif callable(stmt):
            self.src = None
            if isinstance(setup, str):
                _setup = setup
                def setup():
                    exec(_setup, globals(), ns)
            elif not callable(setup):
                raise ValueError("setup is neither a string nor callable")
            self.inner = _template_func(setup, stmt)
        else:
            raise ValueError("stmt is neither a string nor callable")

    def print_exc(self, file=None):
        """Helper to print a traceback from the timed code.

        Typical use:

            t = Timer(...)       # outside the try/except
            try:
                t.timeit(...)    # or t.repeat(...)
            except:
                t.print_exc()

        The advantage over the standard traceback is that source lines
        in the compiled template will be displayed.

        The optional file argument directs where the traceback is
        sent; it defaults to sys.stderr.
        """
        import linecache, traceback
        if self.src is not None:
            linecache.cache[dummy_src_name] = (len(self.src),
                                               None,
                                               self.src.split("\n"),
                                               dummy_src_name)
        # else the source is already stored somewhere else

        traceback.print_exc(file=file)

    def timeit(self, number=default_number):
        """Time 'number' executions of the main statement.

        To be precise, this executes the setup statement once, and
        then returns the time it takes to execute the main statement
        a number of times, as a float measured in seconds.  The
        argument is the number of times through the loop, defaulting
        to one million.  The main statement, the setup statement and
        the timer function to be used are passed to the constructor.
        """
        it = itertools.repeat(None, number)
        gcold = gc.isenabled()
        gc.disable()
        try:
            timing = self.inner(it, self.timer)
        finally:
            if gcold:
                gc.enable()
        return timing

    def repeat(self, repeat=default_repeat, number=default_number):
        """Call timeit() a few times.

        This is a convenience function that calls the timeit()
        repeatedly, returning a list of results.  The first argument
        specifies how many times to call timeit(), defaulting to 3;
        the second argument specifies the timer argument, defaulting
        to one million.

        Note: it's tempting to calculate mean and standard deviation
        from the result vector and report these.  However, this is not
        very useful.  In a typical case, the lowest value gives a
        lower bound for how fast your machine can run the given code
        snippet; higher values in the result vector are typically not
        caused by variability in Python's speed, but by other
        processes interfering with your timing accuracy.  So the min()
        of the result is probably the only number you should be
        interested in.  After that, you should look at the entire
        vector and apply common sense rather than statistics.
        """
        r = []
        for i in range(repeat):
            t = self.timeit(number)
            r.append(t)
        return r

def timeit(stmt="pass", setup="pass", timer=default_timer,
           number=default_number):
    """Convenience function to create Timer object and call timeit method."""
    return Timer(stmt, setup, timer).timeit(number)

def repeat(stmt="pass", setup="pass", timer=default_timer,
           repeat=default_repeat, number=default_number):
    """Convenience function to create Timer object and call repeat method."""
    return Timer(stmt, setup, timer).repeat(repeat, number)

def main(args=None, *, _wrap_timer=None):
    """Main program, used when run as a script.

    The optional 'args' argument specifies the command line to be parsed,
    defaulting to sys.argv[1:].

    The return value is an exit code to be passed to sys.exit(); it
    may be None to indicate success.

    When an exception happens during timing, a traceback is printed to
    stderr and the return value is 1.  Exceptions at other times
    (including the template compilation) are not caught.

    '_wrap_timer' is an internal interface used for unit testing.  If it
    is not None, it must be a callable that accepts a timer function
    and returns another timer function (used for unit testing).
    """
    if args is None:
        args = sys.argv[1:]
    import getopt
    try:
        opts, args = getopt.getopt(args, "n:s:r:tcpvh",
                                   ["number=", "setup=", "repeat=",
                                    "time", "clock", "process",
                                    "verbose", "help"])
    except getopt.error as err:
        print(err)
        print("use -h/--help for command line help")
        return 2
    timer = default_timer
    stmt = "\n".join(args) or "pass"
    number = 0 # auto-determine
    setup = []
    repeat = default_repeat
    verbose = 0
    precision = 3
    for o, a in opts:
        if o in ("-n", "--number"):
            number = int(a)
        if o in ("-s", "--setup"):
            setup.append(a)
        if o in ("-r", "--repeat"):
            repeat = int(a)
            if repeat <= 0:
                repeat = 1
        if o in ("-t", "--time"):
            timer = time.time
        if o in ("-c", "--clock"):
            timer = time.clock
        if o in ("-p", "--process"):
            timer = time.process_time
        if o in ("-v", "--verbose"):
            if verbose:
                precision += 1
            verbose += 1
        if o in ("-h", "--help"):
            print(__doc__, end=' ')
            return 0
    setup = "\n".join(setup) or "pass"
    # Include the current directory, so that local imports work (sys.path
    # contains the directory of this script, rather than the current
    # directory)
    import os
    sys.path.insert(0, os.curdir)
    if _wrap_timer is not None:
        timer = _wrap_timer(timer)
    t = Timer(stmt, setup, timer)
    if number == 0:
        # determine number so that 0.2 <= total time < 2.0
        for i in range(1, 10):
            number = 10**i
            try:
                x = t.timeit(number)
            except:
                t.print_exc()
                return 1
            if verbose:
                print("%d loops -> %.*g secs" % (number, precision, x))
            if x >= 0.2:
                break
    try:
        r = t.repeat(repeat, number)
    except:
        t.print_exc()
        return 1
    best = min(r)
    if verbose:
        print("raw times:", " ".join(["%.*g" % (precision, x) for x in r]))
    print("%d loops," % number, end=' ')
    usec = best * 1e6 / number
    if usec < 1000:
        print("best of %d: %.*g usec per loop" % (repeat, precision, usec))
    else:
        msec = usec / 1000
        if msec < 1000:
            print("best of %d: %.*g msec per loop" % (repeat, precision, msec))
        else:
            sec = msec / 1000
            print("best of %d: %.*g sec per loop" % (repeat, precision, sec))
    return None

if __name__ == "__main__":
    sys.exit(main())