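Where an object's self comes from in Python
I. The example
The example follows the approach in https://eli.thegreenplace.net/2012/06/15/under-the-hood-of-python-class-definitions. The trick is to define the class inside a function, so that the code object handed to __build_class__ can be pulled out of the function's __code__.co_consts and disassembled in full. To save a jump to that page, the same steps are reproduced here.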
$ cat classself.py
def tsecer():
    class harry():
        def fry():
            pass
Python 3.6.0 (default, Nov 15 2018, 10:32:57)
[GCC 4.8.5 20150623 (Red Hat 4.8.5-4)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import dis, classself
>>> dis.dis(classself)
Disassembly of tsecer:
  2           0 LOAD_BUILD_CLASS
              2 LOAD_CONST               1 (<code object harry at 0x7f7b51e3ddc0, file "/data1/harry/work/python/classself/classself.py", line 2>)
              4 LOAD_CONST               2 ('harry')
              6 MAKE_FUNCTION            0
              8 LOAD_CONST               2 ('harry')
             10 CALL_FUNCTION            2
             12 STORE_FAST               0 (harry)
             14 LOAD_CONST               0 (None)
             16 RETURN_VALUE
>>> dis.dis(classself.tsecer.__code__.co_consts[1])
  2           0 LOAD_NAME                0 (__name__)
              2 STORE_NAME               1 (__module__)
              4 LOAD_CONST               0 ('tsecer.<locals>.harry')
              6 STORE_NAME               2 (__qualname__)
  3           8 LOAD_CONST               1 (<code object fry at 0x7f7b51eae640, file "/data1/harry/work/python/classself/classself.py", line 3>)
             10 LOAD_CONST               2 ('tsecer.<locals>.harry.fry')
             12 MAKE_FUNCTION            0
             14 STORE_NAME               3 (fry)
             16 LOAD_CONST               3 (None)
             18 RETURN_VALUE
>>>
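II. How the virtual machine executes LOAD_BUILD_CLASS
Note that the third argument passed to PyEval_EvalCodeEx is the locals mapping: the class body is executed like a function call, and the names it binds are stored in ns. Once the body has run, the populated ns is passed to the metaclass meta as an argument, and those locals become the attributes of the newly created class.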
/* AC: cannot convert yet, waiting for *args support */
static PyObject *
builtin___build_class__(PyObject *self, PyObject *args, PyObject *kwds)
{
    ……
    cell = PyEval_EvalCodeEx(PyFunction_GET_CODE(func), PyFunction_GET_GLOBALS(func), ns,
                             NULL, 0, NULL, 0, NULL, 0, NULL,
                             PyFunction_GET_CLOSURE(func));
    if (cell != NULL) {
        PyObject *margs[3] = {name, bases, ns};
        cls = _PyObject_FastCallDict(meta, margs, 3, mkw);
    ……
}
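The flow described in this section can be approximated in pure Python. The following is a minimal sketch, not the real implementation: the helper name build_class_sketch and the body string are made up for illustration, and the sketch leaves out __prepare__, metaclass resolution and the __class__ cell that the C function also handles.

```python
def build_class_sketch(body_source, name, bases=(), meta=type):
    """Rough analogue of builtin___build_class__ (hypothetical helper)."""
    ns = {}  # plays the role of `ns` in the C code
    # Run the class body with `ns` as its locals, as PyEval_EvalCodeEx does.
    exec(body_source, {"__name__": __name__}, ns)
    # Call the metaclass with (name, bases, ns), like the margs array.
    return meta(name, bases, ns)


body = "def fry(self):\n    return 'fried'\n"
Harry = build_class_sketch(body, "harry")
print(Harry.__name__)   # harry
print(Harry().fry())    # fried
```

The point of the sketch is only that the class body is ordinary code whose local names end up as class attributes.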
III. From locals to the tp_dict of the newly created class
static PyObject *
type_new(PyTypeObject *metatype, PyObject *args, PyObject *kwds)
{
    ……
    /* Check arguments: (name, bases, dict) */
    if (!PyArg_ParseTuple(args, "UO!O!:type.__new__", &name, &PyTuple_Type,
                          &bases, &PyDict_Type, &orig_dict))
    ……
    dict = PyDict_Copy(orig_dict);
    if (dict == NULL)
        goto error;
    ……
    /* Initialize tp_dict from passed-in dict */
    Py_INCREF(dict);
    type->tp_dict = dict;
    ……
}
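The PyDict_Copy call above is observable from Python. A small sketch, using a throwaway class name C chosen for illustration: once the class exists, changes to the namespace dict that was passed in no longer affect it.

```python
ns = {"x": 1}
C = type("C", (), ns)   # type_new copies ns into the new class's tp_dict

ns["y"] = 2             # mutate the original dict afterwards

has_x = hasattr(C, "x")
has_y = hasattr(C, "y")
print(has_x, has_y)     # True False
```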
IV. Executing STORE_NAME inside a class definition
As the code shows, the STORE_NAME instruction stores the value under its name in f->f_locals, which is the ns passed in by builtin___build_class__.
Python-3.6.0\Python\ceval.c
PyObject *
_PyEval_EvalFrameDefault(PyFrameObject *f, int throwflag)
{
    ……
        TARGET(STORE_NAME) {
            PyObject *name = GETITEM(names, oparg);
            PyObject *v = POP();
            PyObject *ns = f->f_locals;
            int err;
            if (ns == NULL) {
                PyErr_Format(PyExc_SystemError,
                             "no locals found when storing %R", name);
                Py_DECREF(v);
                goto error;
            }
            if (PyDict_CheckExact(ns))
                err = PyDict_SetItem(ns, name, v);
            else
                err = PyObject_SetItem(ns, name, v);
            Py_DECREF(v);
            if (err != 0)
                goto error;
            DISPATCH();
        }
    ……
}
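The stores performed by STORE_NAME can be watched from Python by supplying a custom namespace through a metaclass's __prepare__. This is a sketch under assumptions: RecordingNamespace and Meta are hypothetical names, and because the namespace is a dict subclass rather than an exact dict, the stores go through the PyObject_SetItem branch shown above.

```python
class RecordingNamespace(dict):
    """Dict subclass that remembers every key stored into it."""

    def __init__(self):
        super().__init__()
        self.stored = []

    def __setitem__(self, key, value):
        self.stored.append(key)
        super().__setitem__(key, value)


class Meta(type):
    @classmethod
    def __prepare__(mcls, name, bases, **kwargs):
        # The returned mapping becomes f_locals of the class body.
        return RecordingNamespace()

    def __new__(mcls, name, bases, ns, **kwargs):
        cls = super().__new__(mcls, name, bases, dict(ns))
        cls.stored_names = list(ns.stored)
        return cls


class Harry(metaclass=Meta):
    def fry(self):
        pass


# Includes __module__, __qualname__ and fry; newer interpreters may add more.
print(Harry.stored_names)
```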
V. Executing the MAKE_FUNCTION instruction for fry
That is, it creates an object of type PyFunction_Type.
Python-3.6.0\Python\ceval.c
PyObject *
_PyEval_EvalFrameDefault(PyFrameObject *f, int throwflag)
{
    ……
        TARGET(MAKE_FUNCTION) {
            PyObject *qualname = POP();
            PyObject *codeobj = POP();
            PyFunctionObject *func = (PyFunctionObject *)
                PyFunction_NewWithQualName(codeobj, f->f_globals, qualname);
    ……
}
Python-3.6.0\Objects\funcobject.c
PyObject *
PyFunction_NewWithQualName(PyObject *code, PyObject *globals, PyObject *qualname)
{
    PyFunctionObject *op;
    PyObject *doc, *consts, *module;
    static PyObject *__name__ = NULL;
    if (__name__ == NULL) {
        __name__ = PyUnicode_InternFromString("__name__");
        if (__name__ == NULL)
            return NULL;
    }
    op = PyObject_GC_New(PyFunctionObject, &PyFunction_Type);
    if (op == NULL)
        return NULL;
    ……
}
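From the Python side, the object that MAKE_FUNCTION leaves in the class namespace is an ordinary function, and types.FunctionType can build one from a code object in much the same way. A small sketch, in which the name fry_clone is invented for illustration:

```python
import types


def tsecer():
    class harry:
        def fry():
            pass
    return harry


harry = tsecer()
raw = harry.__dict__["fry"]   # the object STORE_NAME put into the namespace
print(type(raw) is types.FunctionType)   # True
print(raw.__qualname__)                  # tsecer.<locals>.harry.fry

# Build another function object by hand from the same code object.
clone = types.FunctionType(raw.__code__, globals(), "fry_clone")
print(clone())                           # None
```

Reading the entry from harry.__dict__ rather than through attribute access keeps the descriptor machinery out of the picture.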
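VI. Calling a method through an object
1. A simple example
In the disassembly, a.xyz() fetches the attribute with the LOAD_ATTR instruction. The file is only disassembled here; actually running it would raise TypeError, because xyz declares no parameter to receive self.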
$ cat callmethod.py
class A():
    def xyz():
        pass
a = A()
a.xyz()
$ ../../Python-3.6.0/python -m dis callmethod.py
  1           0 LOAD_BUILD_CLASS
              2 LOAD_CONST               0 (<code object A at 0x7f8e748a6040, file "callmethod.py", line 1>)
              4 LOAD_CONST               1 ('A')
              6 MAKE_FUNCTION            0
              8 LOAD_CONST               1 ('A')
             10 CALL_FUNCTION            2
             12 STORE_NAME               0 (A)
  4          14 LOAD_NAME                0 (A)
             16 CALL_FUNCTION            0
             18 STORE_NAME               1 (a)
  5          20 LOAD_NAME                1 (a)
             22 LOAD_ATTR                2 (xyz)
             24 CALL_FUNCTION            0
             26 POP_TOP
             28 LOAD_CONST               2 (None)
             30 RETURN_VALUE
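2. Executing the LOAD_ATTR instruction
The key step is the call to _PyType_Lookup. The function quoted below, type_getattro, is the variant used when an attribute is looked up on a type object (for example A.xyz), and it passes NULL as the instance. For an instance such as a, LOAD_ATTR goes through PyObject_GenericGetAttr, which likewise calls _PyType_Lookup on the instance's type and then invokes the descriptor's tp_descr_get, but with the instance as the second argument.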
/* This is similar to PyObject_GenericGetAttr(),
   but uses _PyType_Lookup() instead of just looking in type->tp_dict. */
static PyObject *
type_getattro(PyTypeObject *type, PyObject *name)
{
    ……
    /* No data descriptor found on metatype. Look in tp_dict of this
     * type and its bases */
    attribute = _PyType_Lookup(type, name);
    if (attribute != NULL) {
        /* Implement descriptor functionality, if any */
        descrgetfunc local_get = Py_TYPE(attribute)->tp_descr_get;
        Py_XDECREF(meta_attribute);
        if (local_get != NULL) {
            /* NULL 2nd argument indicates the descriptor was
             * found on the target object itself (or a base) */
            return local_get(attribute, (PyObject *)NULL,
                             (PyObject *)type);
        }
        Py_INCREF(attribute);
        return attribute;
    }
    ……
}
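In _PyType_Lookup, because A is in the MRO of a's type, the lookup finds xyz in A's tp_dict (fry in the first example). The value found is the PyFunction_Type instance seen earlier.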
/* Internal API to look for a name through the MRO.
   This returns a borrowed reference, and doesn't set an exception! */
PyObject *
_PyType_Lookup(PyTypeObject *type, PyObject *name)
{
    Py_ssize_t i, n;
    PyObject *mro, *res, *base, *dict;
    unsigned int h;
    if (MCACHE_CACHEABLE_NAME(name) &&
        PyType_HasFeature(type, Py_TPFLAGS_VALID_VERSION_TAG)) {
        /* fast path */
        h = MCACHE_HASH_METHOD(type, name);
        if (method_cache[h].version == type->tp_version_tag &&
            method_cache[h].name == name) {
#if MCACHE_STATS
            method_cache_hits++;
#endif
            return method_cache[h].value;
        }
    }
    /* Look in tp_dict of types in MRO */
    mro = type->tp_mro;
    ……
    n = PyTuple_GET_SIZE(mro);
    for (i = 0; i < n; i++) {
        base = PyTuple_GET_ITEM(mro, i);
        assert(PyType_Check(base));
        dict = ((PyTypeObject *)base)->tp_dict;
        assert(dict && PyDict_Check(dict));
        res = PyDict_GetItem(dict, name);
        if (res != NULL)
            break;
    }
    ……
}
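The slow path of _PyType_Lookup, the loop over tp_mro, can be mirrored in a few lines of Python. This is an illustrative sketch only: type_lookup_sketch is a hypothetical helper, and it ignores the method cache used by the fast path.

```python
def type_lookup_sketch(tp, name):
    """Walk tp.__mro__ and return the first raw dict entry, or None."""
    for base in tp.__mro__:
        if name in base.__dict__:
            return base.__dict__[name]   # raw entry, no descriptor call
    return None


class A:
    def show(self):
        return "A.show"


class B(A):
    pass


found = type_lookup_sketch(B, "show")
print(found is A.__dict__["show"])          # True
print(type_lookup_sketch(B, "missing"))     # None
```

Like the C function, the helper returns the raw dictionary entry; binding to an instance is a separate, later step.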
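After _PyType_Lookup returns, the descriptor getter of the value found is fetched. In type_getattro this is the statement
descrgetfunc local_get = Py_TYPE(attribute)->tp_descr_get;
and PyObject_GenericGetAttr reads the same slot for lookups on an instance. For PyFunction_Type the slot is func_descr_get, which returns the function unchanged when obj is NULL or None and otherwise wraps it in a bound method: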
/* Bind a function to an object */
static PyObject *
func_descr_get(PyObject *func, PyObject *obj, PyObject *type)
{
    if (obj == Py_None || obj == NULL) {
        Py_INCREF(func);
        return func;
    }
    return PyMethod_New(func, obj);
}
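Both branches of func_descr_get can be exercised from Python by calling the function's __get__ directly. A short sketch, with class A and method show chosen for illustration:

```python
class A:
    def show(self):
        return "shown"


a = A()
func = A.__dict__["show"]          # the plain function stored in tp_dict

bound = func.__get__(a, A)         # obj given: wrapped in a bound method
unbound = func.__get__(None, A)    # obj is None: the function itself

print(type(bound).__name__)        # method
print(unbound is func)             # True
print(bound() == a.show())         # True
```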
PyMethod_New records the instance with im->im_self = self;
PyObject *
PyMethod_New(PyObject *func, PyObject *self)
{
    PyMethodObject *im;
    if (self == NULL) {
        PyErr_BadInternalCall();
        return NULL;
    }
    im = free_list;
    if (im != NULL) {
        free_list = (PyMethodObject *)(im->im_self);
        (void)PyObject_INIT(im, &PyMethod_Type);
        numfree--;
    }
    else {
        im = PyObject_GC_New(PyMethodObject, &PyMethod_Type);
        if (im == NULL)
            return NULL;
    }
    im->im_weakreflist = NULL;
    Py_INCREF(func);
    im->im_func = func;
    Py_XINCREF(self);
    im->im_self = self;
    _PyObject_GC_TRACK(im);
    return (PyObject *)im;
}
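The two fields filled in by PyMethod_New are exposed in Python as __func__ and __self__, and types.MethodType builds the same kind of object by hand. A minimal sketch; the free function show is defined outside the class here purely for illustration:

```python
import types


class A:
    def __init__(self):
        self.xx = "xxx"


def show(self):
    return self.xx


a = A()
m = types.MethodType(show, a)   # same kind of object PyMethod_New creates

print(m.__func__ is show)       # True  (im_func)
print(m.__self__ is a)          # True  (im_self)
print(m())                      # xxx
```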
3. Executing the CALL_FUNCTION instruction
static PyObject *
call_function(PyObject ***pp_stack, Py_ssize_t oparg, PyObject *kwnames)
{
    PyObject **pfunc = (*pp_stack) - oparg - 1;
    PyObject *func = *pfunc;
    PyObject *x, *w;
    Py_ssize_t nkwargs = (kwnames == NULL) ? 0 : PyTuple_GET_SIZE(kwnames);
    Py_ssize_t nargs = oparg - nkwargs;
    PyObject **stack;
    /* Always dispatch PyCFunction first, because these are
       presumed to be the most frequent callable object.
    */
    if (PyCFunction_Check(func)) {
        PyThreadState *tstate = PyThreadState_GET();
        PCALL(PCALL_CFUNCTION);
        stack = (*pp_stack) - nargs - nkwargs;
        C_TRACE(x, _PyCFunction_FastCallKeywords(func, stack, nargs, kwnames));
    }
    else {
        /* The object returned by PyMethod_New takes this branch: self
           replaces the callable's slot on the stack and nargs is
           incremented. This is the self parameter that a method
           defined in a class receives. */
        if (PyMethod_Check(func) && PyMethod_GET_SELF(func) != NULL) {
            /* optimize access to bound methods */
            PyObject *self = PyMethod_GET_SELF(func);
            PCALL(PCALL_METHOD);
            PCALL(PCALL_BOUND_METHOD);
            Py_INCREF(self);
            func = PyMethod_GET_FUNCTION(func);
            Py_INCREF(func);
            Py_SETREF(*pfunc, self);
            nargs++;
        }
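What this branch does, unwrapping the bound method and passing self as an extra leading argument, is equivalent to the following three spellings of the same call. A small sketch with an illustrative class A and a suffix parameter added only to show that the other arguments are unchanged:

```python
class A:
    def __init__(self):
        self.xx = "xxx"

    def show(self, suffix):
        return self.xx + suffix


a = A()
m = a.show                                 # bound method from LOAD_ATTR

via_method = m("!")                        # what the source code writes
via_func = m.__func__(m.__self__, "!")     # what call_function effectively does
via_class = A.show(a, "!")                 # explicit self

print(via_method, via_func, via_class)     # xxx! xxx! xxx!
```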
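4. The benefit of deferring creation of the PyMethod_Type object until LOAD_ATTR executes
This way a function bound to an object can be obtained at any time, for example: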
$ cat methodbind.py
class A():
    def __init__(self):
        self.xx = "xxx"
    def show(self):
        print(self.xx)
a = A()
f = a.show
f()
$ ../../Python-3.6.0/python methodbind.py
xxx
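One practical consequence is that bound methods can be stored and passed around as callbacks, each remembering its own instance. A minimal sketch; the Counter class and its bump method are invented for illustration:

```python
class Counter:
    def __init__(self, name):
        self.name = name
        self.count = 0

    def bump(self):
        self.count += 1
        return "%s=%d" % (self.name, self.count)


x, y = Counter("x"), Counter("y")

# Each attribute access produces a method object bound to its own instance.
callbacks = [x.bump, y.bump, x.bump]
results = [cb() for cb in callbacks]
print(results)   # ['x=1', 'y=1', 'x=2']
```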
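5. Why LOAD_ATTR on a module object binds no self
For a module object, the MRO of its type contains only module and object, and neither type's dict holds the module's own variables. _PyType_Lookup therefore finds nothing, no tp_descr_get is invoked, and no self is bound. The function is found in the module's own __dict__ instead and returned as is.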
VII. Seeing it in an example
1. A direct example
$ cat methodbind.py
class A():
    def __init__(self):
        self.xx = "xxx"
    def show(self):
        print(self.xx)
a = A()
f = a.show
f()
$ ../../Python-3.6.0/python
Python 3.6.0 (default, Nov 15 2018, 10:32:57)
[GCC 4.8.5 20150623 (Red Hat 4.8.5-4)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import methodbind
xxx
>>> print(methodbind.__class__.__mro__)
(<class 'module'>, <class 'object'>)
>>> print(methodbind.a.__class__.__mro__)
(<class 'methodbind.A'>, <class 'object'>)
>>> dir(methodbind.A)
['__class__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__lt__', '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', 'show']
>>> print(methodbind.__class__)
<class 'module'>
>>> dir(methodbind.__class__)
['__class__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__lt__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__']
>>>
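2. Explanation
As the output shows, the MRO of methodbind's type consists of module and object, and neither contains the function being looked up, because a module's functions are stored in the module's own dict. For the object a, the MRO of its type consists of A and object, and show is in A's tp_dict, so the lookup finds it.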
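The difference can be reproduced without a separate file. In this sketch the module demo_module is created on the fly with types.ModuleType, and the attribute names on_class and on_instance are made up for illustration; a function stored in an object's own dict comes back unbound, while the same function stored on the type is bound.

```python
import types


def greet(*args):
    return args   # shows whether a self argument was prepended


mod = types.ModuleType("demo_module")
mod.greet = greet             # stored in the module's own __dict__


class A:
    pass


A.on_class = greet            # stored in the type's dict (tp_dict)
a = A()
a.on_instance = greet         # stored in the instance's own __dict__

print(mod.greet())                 # ()
print(a.on_instance())             # ()
print(a.on_class() == (a,))        # True
```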
VIII. Postscript
When debugging under gdb, if Python was built in DEBUG mode, _PyUnicode_utf8 can be called to display variables of type PyUnicodeObject.
Breakpoint 2, _PyEval_EvalFrameDefault (f=<optimized out>, throwflag=<optimized out>) at Python/ceval.c:2162
2162 if (ns == NULL) {
(gdb) p _PyUnicode_utf8(name)
$1 = 0x7ffff7fa6f60 "__module__"