
Python Notes (3): How the Virtual Machine Executes Code


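Any discussion of how Python executes code has to start with .pyc files and bytecode. For an imported module, the PyCodeObject is created when the module is loaded, that is, at import time.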

.pyc files

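  1. Running python test.py compiles test.py to bytecode and interprets it, but does not write a test.pyc file.
  2. If test.py imports another module, for example import urllib2, Python compiles urllib2.py to bytecode, writes urllib2.pyc, and then interprets the bytecode. (On Python 3 the cached file goes into a __pycache__ directory next to the source, and urllib2 itself only exists on Python 2.)
  3. To generate test.pyc explicitly, use the built-in py_compile module, for example by running python -m py_compile test.py.
  4. When a module is loaded and both the .py and the .pyc exist, Python runs the .pyc. If the .pyc was compiled before the .py was last modified, Python recompiles the .py file and updates the .pyc.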
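Point 3 can also be done from code. The sketch below is only an illustration: the temporary directory and the helper name compile_to_pyc are made up for this example, and it relies on nothing but the standard py_compile module.

```python
import os
import py_compile
import tempfile


def compile_to_pyc(source_text):
    """Write source_text to a temporary test.py and byte-compile it.

    Returns (path of the .py file, path of the generated .pyc file).
    """
    workdir = tempfile.mkdtemp()
    src_path = os.path.join(workdir, "test.py")
    with open(src_path, "w") as fh:
        fh.write(source_text)
    # doraise=True turns compile errors into exceptions instead of stderr output.
    pyc_path = py_compile.compile(src_path, doraise=True)
    return src_path, pyc_path


src_path, pyc_path = compile_to_pyc("print('hello')\n")
print(pyc_path)
```

On Python 3 the printed path normally points into a __pycache__ directory next to test.py, with the interpreter version embedded in the file name.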

PyCodeObject

Compiling Python code means producing a PyCodeObject. Below is the PyCodeObject definition from Python 3.5.7.

/* Bytecode object */
typedef struct {
    PyObject_HEAD
    int co_argcount;		/* #arguments, except *args; number of positional parameters of the code block */
    int co_kwonlyargcount;	/* #keyword only arguments */
    int co_nlocals;		/* #local variables */
    int co_stacksize;		/* #entries needed for evaluation stack */
    int co_flags;		/* CO_..., see below */
    PyObject *co_code;		/* instruction opcodes */
    PyObject *co_consts;	/* list (constants used) */
    PyObject *co_names;		/* list of strings (names used) */
    PyObject *co_varnames;	/* tuple of strings (local variable names) */
    PyObject *co_freevars;	/* tuple of strings (free variable names) */
    PyObject *co_cellvars;      /* tuple of strings (cell variable names) */
    /* The rest aren't used in either hash or comparisons, except for
       co_name (used in both) and co_firstlineno (used only in
       comparisons).  This is done to preserve the name and line number
       for tracebacks and debuggers; otherwise, constant de-duplication
       would collapse identical functions/lambdas defined on different lines.
    */
    unsigned char *co_cell2arg; /* Maps cell vars which are arguments. */
    PyObject *co_filename;	/* unicode (where it was loaded from) */
    PyObject *co_name;		/* unicode (name, for reference) */
    int co_firstlineno;		/* first source line number */
    PyObject *co_lnotab;	/* string (encoding addr<->lineno mapping) See
				   Objects/lnotab_notes.txt for details. */
    void *co_zombieframe;     /* for optimization only (see frameobject.c) */
    PyObject *co_weakreflist;   /* to support weakrefs to code objects */
} PyCodeObject;
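From Python code the same structure is visible as a code object. As a minimal sketch (the source string and the file name "<demo>" are made up for illustration), the built-in compile() returns one, and exec() hands it to the virtual machine:

```python
import types

source = "x = 1 + 2\n"
# compile() runs the compiler and hands back the resulting code object.
code = compile(source, "<demo>", "exec")

print(type(code) is types.CodeType)
print(code.co_filename)
print(code.co_name)

namespace = {}
exec(code, namespace)  # the virtual machine executes the bytecode
print(namespace["x"])
```

The co_* attributes discussed below are read-only views of the C fields with the same names.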
  1. co_argcount, co_kwonlyargcount

PEP 3102:http://www.python.org/dev/peps/pep-3102/

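Keyword-only argument: a named parameter that appears after *varargs in the function's parameter list and can therefore only be supplied as a keyword argument.

When a function is called, the arguments are stored in the frame's local slots in this order: positional parameters --> keyword-only parameters --> the variable-argument tuple (*varargs)

co_argcount: the number of positional parameters of the code block, i.e. the parameters that can be passed positionally in a call (the variable-argument parameter *varargs is not counted).

co_kwonlyargcount: the number of keyword-only parameters of the code block, i.e. the parameters declared after *varargs. Every argument for these parameters must be passed in the form key=value.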

>>> def func(a, b, *d, c):
...     m = 1
...     pass
...
>>> func.__code__.co_argcount
2
>>> func.__code__.co_kwonlyargcount
1
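To see what "keyword-only" means at call time, here is a small sketch with the same signature. The body is changed to return its arguments, which is an assumption made purely so the binding is visible:

```python
def func(a, b, *d, c):
    return a, b, d, c


# Extra positional arguments are absorbed by *d, so c can only be given by name.
print(func(1, 2, 3, 4, c=5))

try:
    func(1, 2, 3)  # 3 goes into d; c is still missing
    error = None
except TypeError as exc:
    error = exc
print(type(error).__name__)
```

The second call fails because no positional value can ever reach c.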
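  2. co_nlocals: the total number of local variables in the code block: its parameters (co_argcount + co_kwonlyargcount + the number of variable-argument parameters) plus the local variables defined inside the code block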
>>> f1.__code__.co_nlocals
7

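f1 takes the same parameters as the func(a, b, c, *d, e, f) test shown later, so the 7 locals are a, b, c, e, f, d and m.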
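The count can be checked against co_varnames. In this sketch the definition of f1 is an assumption (the article never shows it); it mirrors the func(a, b, c, *d, e, f) test used later:

```python
def f1(a, b, c, *d, e, f):
    m = 1


code = f1.__code__
# Parameters come first in co_varnames, then locals assigned in the body.
print(code.co_varnames)
print(code.co_nlocals == len(code.co_varnames))
```

Here the total is 3 positional parameters + 2 keyword-only parameters + 1 variable-argument tuple + 1 local assigned in the body.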

  3. co_stacksize: the number of value-stack entries needed to execute this code block
>>> f1.__code__.co_stacksize
1
  4. co_code: the sequence of bytecode instructions produced by compiling the code block
>>> f1.__code__.co_code
b'd\x01\x00}\x06\x00d\x00\x00S'
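The byte string above can be read by hand. The following is a rough sketch of a decoder for the Python 3.5 instruction layout only (one opcode byte, followed by a two-byte little-endian argument for opcodes >= 90). The table lists just the 3.5 opcode numbers needed here, and the names decode_35 and OPNAMES_35 are made up for this example:

```python
# Opcode numbers from CPython 3.5 (only the ones needed here).
OPNAMES_35 = {
    1: "POP_TOP",
    83: "RETURN_VALUE",
    100: "LOAD_CONST",
    124: "LOAD_FAST",
    125: "STORE_FAST",
}
HAVE_ARGUMENT = 90  # opcodes >= 90 are followed by a 2-byte little-endian argument


def decode_35(co_code):
    """Return a list of (offset, opname, argument) for 3.5-style bytecode."""
    result = []
    offset = 0
    while offset < len(co_code):
        opcode = co_code[offset]
        name = OPNAMES_35.get(opcode, "<%d>" % opcode)
        if opcode >= HAVE_ARGUMENT:
            arg = co_code[offset + 1] | (co_code[offset + 2] << 8)
            result.append((offset, name, arg))
            offset += 3
        else:
            result.append((offset, name, None))
            offset += 1
    return result


for row in decode_35(b'd\x01\x00}\x06\x00d\x00\x00S'):
    print(row)
```

The decoded sequence is LOAD_CONST 1, STORE_FAST 6, LOAD_CONST 0, RETURN_VALUE: load co_consts[1] (the value 1), store it into local slot 6 (m), then load co_consts[0] (None) and return it. From Python 3.6 on the instruction layout is different, so on a current interpreter the standard dis module (dis.dis(f1)) is the reliable way to inspect bytecode.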
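  5. co_consts, co_names
  • co_consts: a tuple of all constants used in the code block
  • co_names: a tuple of the names referenced by the bytecode other than local variables (globals, attributes and so on)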
>>> f1.__code__.co_consts
(None, 1)
>>> f1.__code__.co_names
()
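co_names is empty for f1 because its body touches only local variables. A small sketch with a hypothetical function g that does look up a global and an attribute shows the difference:

```python
def g(x):
    y = len(x.items)
    return y


code = g.__code__
print(code.co_names)     # globals and attributes looked up by name
print(code.co_varnames)  # parameters and locals live here instead
```

len and items land in co_names, while x and y are addressed by slot number and appear only in co_varnames.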
  6. co_filename, co_name
  • co_filename: the path of the .py file the code block was compiled from
  • co_name: the name of the code block, usually the function name or class name
>>> f1.__code__.co_filename
'<stdin>'  # typed at the interactive console, hence <stdin>
>>> f1.__code__.co_name
'f1'
  7. co_firstlineno: the line on which the code block starts in its .py file. Example, test_1.py:
def func(a, b, c, *d, e, f):
    m = 1
    pass
print(func.__code__.co_firstlineno)

Output

1
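The value is 1 only because the def statement is the first line of the file. As a sketch (the two leading blank lines are added on purpose, and compile() stands in for running a real test_1.py), moving the function down changes co_firstlineno accordingly:

```python
import types

# Two blank lines push the def statement down to line 3.
source = "\n\ndef func(a, b, c, *d, e, f):\n    m = 1\n    pass\n"
module_code = compile(source, "test_1.py", "exec")

# The function body is compiled into its own code object, stored as a constant.
func_code = next(c for c in module_code.co_consts if isinstance(c, types.CodeType))
print(func_code.co_name, func_code.co_firstlineno)
```

This also shows that a function's code object is itself one of the constants of the enclosing module's code object.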
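  8. co_varnames, co_freevars, co_cellvars
  • co_varnames: variables assigned in this code block (parameters included) that are not referenced by an inner code block
  • co_freevars (free variables): variables referenced in this code block but assigned in an enclosing code block
  • co_cellvars (cell variables, i.e. variables captured by inner code): variables assigned in this code block and referenced by an inner code block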
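The three categories can be told apart with a two-level function. This is a minimal sketch; outer, inner, x and y are names invented for the example:

```python
def outer():
    x = 1      # assigned here and read by inner -> cell variable of outer
    y = 2      # assigned here, never seen by inner -> plain local

    def inner():
        return x   # read here, assigned in outer -> free variable of inner

    return inner


inner = outer()
print(outer.__code__.co_cellvars)
print(inner.__code__.co_freevars)
print(inner.__closure__[0].cell_contents)
```

The same name x shows up as a cell variable on the outer side and as a free variable on the inner side; the cell object in __closure__ is what connects the two.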

Test with a plain function:

def func(a, b, c, *d, e, f):
    m = 1
    pass

print('co_argcount        :', func.__code__.co_argcount)
print('co_kwonlyargcount  :', func.__code__.co_kwonlyargcount)
print('co_nlocals         :', func.__code__.co_nlocals)
print('co_stacksize       :', func.__code__.co_stacksize)
print('co_flags           :', func.__code__.co_flags)
print('co_code            :', func.__code__.co_code)
print('co_consts          :', func.__code__.co_consts)
print('co_names           :', func.__code__.co_names)
print('co_varnames        :', func.__code__.co_varnames)
print('co_freevars        :', func.__code__.co_freevars)
print('co_cellvars        :', func.__code__.co_cellvars)
print('co_filename        :', func.__code__.co_filename)
print('co_name            :', func.__code__.co_name)
print('co_firstlineno     :', func.__code__.co_firstlineno)
print('co_lnotab          :', func.__code__.co_lnotab)

Output

co_argcount        : 3
co_kwonlyargcount  : 2
co_nlocals         : 7
co_stacksize       : 1
co_flags           : 71
co_code            : b'd\x01\x00}\x06\x00d\x00\x00S'
co_consts          : (None, 1)
co_names           : ()
co_varnames        : ('a', 'b', 'c', 'e', 'f', 'd', 'm')
co_freevars        : ()
co_cellvars        : ()
co_filename        : pyvm_test2_function.py
co_name            : func
co_firstlineno     : 1
co_lnotab          : b'\x00\x01\x06\x01'
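The co_flags value 71 in this output is a bit mask. The sketch below decodes it with the flag values from CPython 3.5's Include/code.h; the table is copied by hand and limited to the low seven bits, and decode_flags is a name made up for this example:

```python
# Flag bits as defined in CPython 3.5's Include/code.h.
CO_FLAGS_35 = [
    (0x01, "CO_OPTIMIZED"),
    (0x02, "CO_NEWLOCALS"),
    (0x04, "CO_VARARGS"),
    (0x08, "CO_VARKEYWORDS"),
    (0x10, "CO_NESTED"),
    (0x20, "CO_GENERATOR"),
    (0x40, "CO_NOFREE"),
]


def decode_flags(co_flags):
    """Return the names of the flag bits set in co_flags."""
    return [name for bit, name in CO_FLAGS_35 if co_flags & bit]


print(decode_flags(71))
```

71 is CO_OPTIMIZED | CO_NEWLOCALS | CO_VARARGS | CO_NOFREE: the function takes *d and has no free or cell variables. By the same table, 7 lacks CO_NOFREE (the function has a cell variable) and 19 adds CO_NESTED (the function is defined inside another function).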

Test with a nested function:

def func(a, b, c, *d, e, f):
    m = 1
    def wapper():
        n = m
    print('wapper-->co_argcount        :', wapper.__code__.co_argcount)
    print('wapper-->co_kwonlyargcount  :', wapper.__code__.co_kwonlyargcount)
    print('wapper-->co_nlocals         :', wapper.__code__.co_nlocals)
    print('wapper-->co_stacksize       :', wapper.__code__.co_stacksize)
    print('wapper-->co_flags           :', wapper.__code__.co_flags)
    print('wapper-->co_code            :', wapper.__code__.co_code)
    print('wapper-->co_consts          :', wapper.__code__.co_consts)
    print('wapper-->co_names           :', wapper.__code__.co_names)
    print('wapper-->co_varnames        :', wapper.__code__.co_varnames)
    print('wapper-->co_freevars        :', wapper.__code__.co_freevars)
    print('wapper-->co_cellvars        :', wapper.__code__.co_cellvars)
    print('wapper-->co_filename        :', wapper.__code__.co_filename)
    print('wapper-->co_name            :', wapper.__code__.co_name)
    print('wapper-->co_firstlineno     :', wapper.__code__.co_firstlineno)
    print('wapper-->co_lnotab          :', wapper.__code__.co_lnotab)

print('func-->co_argcount        :', func.__code__.co_argcount)
print('func-->co_kwonlyargcount  :', func.__code__.co_kwonlyargcount)
print('func-->co_nlocals         :', func.__code__.co_nlocals)
print('func-->co_stacksize       :', func.__code__.co_stacksize)
print('func-->co_flags           :', func.__code__.co_flags)
print('func-->co_code            :', func.__code__.co_code)
print('func-->co_consts          :', func.__code__.co_consts)
print('func-->co_names           :', func.__code__.co_names)
print('func-->co_varnames        :', func.__code__.co_varnames)
print('func-->co_freevars        :', func.__code__.co_freevars)
print('func-->co_cellvars        :', func.__code__.co_cellvars)
print('func-->co_filename        :', func.__code__.co_filename)
print('func-->co_name            :', func.__code__.co_name)
print('func-->co_firstlineno     :', func.__code__.co_firstlineno)
print('func-->co_lnotab          :', func.__code__.co_lnotab)
print('=========================================================')
func(1, 2, 3, 4, 5, 6, 7, e = 8, f = 9)

Output

func-->co_argcount        : 3
func-->co_kwonlyargcount  : 2
func-->co_nlocals         : 7
func-->co_stacksize       : 3
func-->co_flags           : 7
func-->co_code            : b'd\x01\x00\x89\x00\x00\x87\x00\x00f\x01\x00d\x02\x00d\x03\x00\x86\x00\x00}\x06\x00t\x00\x00d\x04\x00|\x06\x00j\x01\x00j\x02\x00\x83\x02\x00\x01t\x00\x00d\x05\x00|\x06\x00j\x01\x00j\x03\x00\x83\x02\x00\x01t\x00\x00d\x06\x00|\x06\x00j\x01\x00j\x04\x00\x83\x02\x00\x01t\x00\x00d\x07\x00|\x06\x00j\x01\x00j\x05\x00\x83\x02\x00\x01t\x00\x00d\x08\x00|\x06\x00j\x01\x00j\x06\x00\x83\x02\x00\x01t\x00\x00d\t\x00|\x06\x00j\x01\x00j\x07\x00\x83\x02\x00\x01t\x00\x00d\n\x00|\x06\x00j\x01\x00j\x08\x00\x83\x02\x00\x01t\x00\x00d\x0b\x00|\x06\x00j\x01\x00j\t\x00\x83\x02\x00\x01t\x00\x00d\x0c\x00|\x06\x00j\x01\x00j\n\x00\x83\x02\x00\x01t\x00\x00d\r\x00|\x06\x00j\x01\x00j\x0b\x00\x83\x02\x00\x01t\x00\x00d\x0e\x00|\x06\x00j\x01\x00j\x0c\x00\x83\x02\x00\x01t\x00\x00d\x0f\x00|\x06\x00j\x01\x00j\r\x00\x83\x02\x00\x01t\x00\x00d\x10\x00|\x06\x00j\x01\x00j\x0e\x00\x83\x02\x00\x01t\x00\x00d\x11\x00|\x06\x00j\x01\x00j\x0f\x00\x83\x02\x00\x01t\x00\x00d\x12\x00|\x06\x00j\x01\x00j\x10\x00\x83\x02\x00\x01d\x00\x00S'
func-->co_consts          : (None, 1, <code object wapper at 0x000002A033189B70, file "pyvm_test3_function.py", line 3>, 'func.<locals>.wapper', 'wapper-->co_argcount        :', 'wapper-->co_kwonlyargcount  :', 'wapper-->co_nlocals         :', 'wapper-->co_stacksize       :', 'wapper-->co_flags           :', 'wapper-->co_code            :', 'wapper-->co_consts          :', 'wapper-->co_names           :', 'wapper-->co_varnames        :', 'wapper-->co_freevars        :', 'wapper-->co_cellvars        :', 'wapper-->co_filename        :', 'wapper-->co_name            :', 'wapper-->co_firstlineno     :', 'wapper-->co_lnotab          :')
func-->co_names           : ('print', '__code__', 'co_argcount', 'co_kwonlyargcount', 'co_nlocals', 'co_stacksize', 'co_flags', 'co_code', 'co_consts', 'co_names', 'co_varnames', 'co_freevars', 'co_cellvars', 'co_filename', 'co_name', 'co_firstlineno', 'co_lnotab')
func-->co_varnames        : ('a', 'b', 'c', 'e', 'f', 'd', 'wapper')
func-->co_freevars        : ()
func-->co_cellvars        : ('m',)
func-->co_filename        : pyvm_test3_function.py
func-->co_name            : func
func-->co_firstlineno     : 1
func-->co_lnotab          : b'\x00\x01\x06\x01\x12\x02\x13\x01\x13\x01\x13\x01\x13\x01\x13\x01\x13\x01\x13\x01\x13\x01\x13\x01\x13\x01\x13\x01\x13\x01\x13\x01\x13\x01'
=========================================================
wapper-->co_argcount        : 0
wapper-->co_kwonlyargcount  : 0
wapper-->co_nlocals         : 1
wapper-->co_stacksize       : 1
wapper-->co_flags           : 19
wapper-->co_code            : b'\x88\x00\x00}\x00\x00d\x00\x00S'
wapper-->co_consts          : (None,)
wapper-->co_names           : ()
wapper-->co_varnames        : ('n',)
wapper-->co_freevars        : ('m',)
wapper-->co_cellvars        : ()
wapper-->co_filename        : pyvm_test3_function.py
wapper-->co_name            : wapper
wapper-->co_firstlineno     : 3
wapper-->co_lnotab          : b'\x00\x01'

Closure test:
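The source of this test is not shown. Judging only from the output (func's bytecode ends by loading and returning wapper, and the code object labelled f3 is named wapper), it probably looked roughly like the sketch below. This is a hypothetical reconstruction rather than the author's file, and it prints only a subset of the attributes:

```python
def func(a, b, c, *d, e, f):
    m = 1

    def wapper():
        n = m

    return wapper


# Hypothetical reconstruction: the closure returned by func is bound to f3.
f3 = func(1, 2, 3, 4, 5, 6, 7, e=8, f=9)

for label, fn in (("func", func), ("f3", f3)):
    code = fn.__code__
    for attr in ("co_argcount", "co_kwonlyargcount", "co_nlocals",
                 "co_varnames", "co_freevars", "co_cellvars", "co_name"):
        print("%s-->%-18s:" % (label, attr), getattr(code, attr))
```

Even after func has returned, f3 still reaches m through its closure cell: f3.__closure__[0].cell_contents is 1.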
Output:

func-->co_argcount        : 3
func-->co_kwonlyargcount  : 2
func-->co_nlocals         : 7
func-->co_stacksize       : 3
func-->co_flags           : 7
func-->co_code            : b'd\x01\x00\x89\x00\x00\x87\x00\x00f\x01\x00d\x02\x00d\x03\x00\x86\x00\x00}\x06\x00|\x06\x00S'
func-->co_consts          : (None, 1, <code object wapper at 0x0000019920289B70, file "pyvm_test4_function.py", line 3>, 'func.<locals>.wapper')
func-->co_names           : ()
func-->co_varnames        : ('a', 'b', 'c', 'e', 'f', 'd', 'wapper')
func-->co_freevars        : ()
func-->co_cellvars        : ('m',)
func-->co_filename        : pyvm_test4_function.py
func-->co_name            : func
func-->co_firstlineno     : 1
func-->co_lnotab          : b'\x00\x01\x06\x01\x12\x02'
=========================================================
f3-->co_argcount        : 0
f3-->co_kwonlyargcount  : 0
f3-->co_nlocals         : 1
f3-->co_stacksize       : 1
f3-->co_flags           : 19
f3-->co_code            : b'\x88\x00\x00}\x00\x00d\x00\x00S'
f3-->co_consts          : (None,)
f3-->co_names           : ()
f3-->co_varnames        : ('n',)
f3-->co_freevars        : ('m',)
f3-->co_cellvars        : ()
f3-->co_filename        : pyvm_test4_function.py
f3-->co_name            : wapper
f3-->co_firstlineno     : 3
f3-->co_lnotab          : b'\x00\x01'

  9. co_lnotab: the mapping between bytecode instruction offsets and source code line numbers

Objects/lnotab_notes.txt:
All about co_lnotab, the line number table.
Code objects store a field named co_lnotab. This is an array of unsigned bytes disguised as a Python string. It is used to map bytecode offsets to source code line #s for tracebacks and to identify line number boundaries for line tracing.
The array is conceptually a compressed list of (bytecode offset increment, line number increment) pairs. The details are important and delicate, best illustrated by example:

    byte code offset    source code line number
        0                   1
        6                   2
       50                   7
      350                 307
      361                 308

Instead of storing these numbers literally, we compress the list by storing only the increments from one row to the next. Conceptually, the stored list might look like:

0, 1, 6, 1, 44, 5, 300, 300, 11, 1

Adding the increments back up reproduces the table: 0, 1, (0+6), (1+1), (6+44), (2+5), (50+300), (7+300), (350+11), (307+1)
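The accumulation can be written out as a few lines of code. This is a sketch of the conceptual scheme only; decode_increments and its first_line parameter are names chosen for this example:

```python
def decode_increments(increments, first_line=0):
    """Turn (offset increment, line increment) pairs into absolute rows."""
    rows = []
    offset, line = 0, first_line
    for i in range(0, len(increments), 2):
        offset += increments[i]
        line += increments[i + 1]
        rows.append((offset, line))
    return rows


# The conceptual list from lnotab_notes.txt.
print(decode_increments([0, 1, 6, 1, 44, 5, 300, 300, 11, 1]))
# The real co_lnotab of the plain func test, whose co_firstlineno is 1.
print(decode_increments(list(b'\x00\x01\x06\x01'), first_line=1))
```

The second call gives offset 0 on line 2 (m = 1) and offset 6 on line 3 (pass), matching the test file. An increment such as 300 does not fit in one byte, so the real table splits it over several pairs, as lnotab_notes.txt goes on to explain.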

References:

[How the Python virtual machine runs] https://www.cnblogs.com/webber19