About metaclass: I thought I understood it

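  About metaclass, that piece of black magic in Python 2.x: I thought I understood it. Only after getting slapped in the face did I realize I was too young, too simple, sometimes naive.

  Why did I think I understood it? Because I had read the masterpiece on Stack Overflow, "what-is-a-metaclass-in-python" (in this post that always means e-satis's answer), which also has a good Chinese translation on Jobbole (伯樂在線), "深刻理解Python中的元類(metaclass)". I had also used metaclasses in real projects: to create a singleton, as described in "creating-a-singleton-in-python", and to get a mixin effect. It was this last use case that made me re-examine metaclasses.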

  Original post: http://www.cnblogs.com/xybaby/p/7927407.html

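Key points recap

  "what-is-a-metaclass-in-python" really is excellent; read it carefully and metaclasses become mostly clear. So I only stress a few key points here, and I strongly recommend that any Pythonista who has not read the original do so.

First: everything is an object

  In Python everything is an object: a number, a string, a function. An object is an instance of a class, and a class is itself an object, an instance of type. The type object is in turn an instance of type (chicken or egg?), which is why we call type a metaclass. The book 《python源碼剖析》 has a clear diagram of this.

  [Figure: the relationships among instances, classes, and type]

  In Python you can inspect an object's class through its __class__ attribute, and use isinstance to check whether an object is an instance of a given class. For example: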

>>> class OBJ(object):
...     a = 1
...
>>> o = OBJ()
>>> o.__class__
<class '__main__.OBJ'>
>>> isinstance(o, OBJ)
True
>>> OBJ.__class__
<type 'type'>
>>> isinstance(OBJ, type)
True
>>> type.__class__
<type 'type'>
>>>

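Second: a metaclass can customize class creation

  We normally create a class with class OBJ(object): pass. As noted above, a class is an instance of type, so by analogy with how we usually create instances, a class should be creatable with type(*args). That is indeed the case, and the Python documentation says so explicitly: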

class type(name, bases, dict)
With three arguments, return a new type object. This is essentially a dynamic form of the class statement. The name string is the class name and becomes the __name__ attribute; the bases tuple itemizes the base classes and becomes the __bases__ attribute; and the dict dictionary is the namespace containing definitions for class body and becomes the __dict__ attribute. For example, the following two statements create identical type objects:

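  The call returns a class; the three arguments are the class name, the tuple of base classes, and the class attributes. For example, the OBJ class above is equivalent to: OBJ = type('OBJ', (object,), {'a': 1})

  Creating a class this way looks silly, of course, but its advantage is that classes can be created dynamically.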
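As a rough sketch of why the dynamic form can be useful, the snippet below builds a class from plain data at run time. The helper name `make_record_class` and the field names are made up for illustration; only the three-argument `type(name, bases, dict)` call comes from the documentation quoted above.

```python
def make_record_class(name, defaults):
    """Build a class called `name` whose class attributes come from `defaults`.

    Hypothetical helper for illustration; not part of any library.
    """
    namespace = dict(defaults)  # copy, so the caller's dict is not modified
    return type(name, (object,), namespace)


Point = make_record_class("Point", {"x": 0, "y": 0})
p = Point()
print("%s %s %s" % (Point.__name__, p.x, p.y))
```

The class name and its attributes are ordinary values here, so they could just as well come from a config file or a loop.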

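  Python opens class customization up to developers. type is itself a type, so it can be subclassed, and a subclass of type replaces Python's default class-creation behavior. When would you need that?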

  Some ideas that have been explored including logging, interface checking, automatic delegation, automatic property creation, proxies, frameworks, and automatic resource locking/synchronization.

  So when we declare a class with class OBJ(object): pass, how do we specify how OBJ is created? By setting __metaclass__ in the class body. The simplest example:

class Metaclass(type):
    def __new__(cls, name, bases, dct):
        print 'HAHAHA'
        dct['a'] = 1  # inject a class attribute before the class is created
        return type.__new__(cls, name, bases, dct)

print 'before Create OBJ'
class OBJ(object):
    __metaclass__ = Metaclass
print 'after Create OBJ'

if __name__ == '__main__':
    print OBJ.a

  Output:

before Create OBJ
HAHAHA
after Create OBJ
1

  As the output shows, __metaclass__ took effect while the OBJ class was being created and added a class attribute 'a' to OBJ.

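Third: two details about __metaclass__

  First, __metaclass__ only has to be a callable; it does not have to be a class. what-is-a-metaclass-in-python has an example where __metaclass__ is a function, and also explains why making it a class is the better choice.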
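To make this first point concrete, here is a minimal sketch in which the "metaclass" is a plain function. The names `upper_attr`, `Foo` and `SubFoo` are illustrative. The function is called directly with `(name, bases, dict)`, which is roughly what the class statement does with `__metaclass__` under Python 2, so the snippet also runs unchanged on Python 3.

```python
def upper_attr(name, bases, dct):
    """A function used as a metaclass: upper-case every non-dunder attribute name."""
    new_dct = {}
    for key, value in dct.items():
        if key.startswith("__"):
            new_dct[key] = value
        else:
            new_dct[key.upper()] = value
    return type(name, bases, new_dct)


# Roughly what Python 2 does for a class body containing `__metaclass__ = upper_attr`.
Foo = upper_attr("Foo", (object,), {"bar": "bip"})


class SubFoo(Foo):
    baz = 1  # stays lower-case: SubFoo is created by plain `type`


print(hasattr(Foo, "BAR"))
print(hasattr(Foo, "bar"))
print(type(Foo) is type)
print(hasattr(SubFoo, "BAZ"))
```

One practical consequence, and arguably one reason a class is usually preferred: the object returned by the function is an ordinary instance of `type`, so subclasses do not pass through the function again.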

  Second, how __metaclass__ is looked up and applied. what-is-a-metaclass-in-python does not cover this in detail, but the Python documentation does:

The appropriate metaclass is determined by the following precedence rules:
  ● If dict['__metaclass__'] exists, it is used.
  ● Otherwise, if there is at least one base class, its metaclass is used (this looks for a __class__ attribute first and if not found, uses its type).
  ● Otherwise, if a global variable named __metaclass__ exists, it is used.
  ● Otherwise, the old-style, classic metaclass (types.ClassType) is used.

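  In other words:

    look in the class's own dict first; otherwise use the metaclass of the base class (there are some details to watch out for here, covered later); otherwise look in the global scope; otherwise fall back to the default creation mechanism.

  The corresponding CPython source is in ceval.c::build_class. The core code is below, and it is quite clear.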

static PyObject *
build_class(PyObject *methods, PyObject *bases, PyObject *name)
{
    PyObject *metaclass = NULL, *result, *base;

    if (PyDict_Check(methods))
        metaclass = PyDict_GetItemString(methods, "__metaclass__");
    if (metaclass != NULL)
        Py_INCREF(metaclass);
    else if (PyTuple_Check(bases) && PyTuple_GET_SIZE(bases) > 0) {
        base = PyTuple_GET_ITEM(bases, 0);
        metaclass = PyObject_GetAttrString(base, "__class__");
        if (metaclass == NULL) {
            PyErr_Clear();
            metaclass = (PyObject *)base->ob_type;
            Py_INCREF(metaclass);
        }
    }
    else {
        PyObject *g = PyEval_GetGlobals();
        if (g != NULL && PyDict_Check(g))
            metaclass = PyDict_GetItemString(g, "__metaclass__");
        if (metaclass == NULL)
            metaclass = (PyObject *) &PyClass_Type;
        Py_INCREF(metaclass);
    }
    result = PyObject_CallFunctionObjArgs(metaclass, name, bases, methods,
                                          NULL);
    Py_DECREF(metaclass);
    if (result == NULL && PyErr_ExceptionMatches(PyExc_TypeError)) {
        /* A type error here likely means that the user passed
           in a base that was not a class (such the random module
           instead of the random.random type).  Help them out with
           by augmenting the error message with more information.*/

        PyObject *ptype, *pvalue, *ptraceback;

        PyErr_Fetch(&ptype, &pvalue, &ptraceback);
        if (PyString_Check(pvalue)) {
            PyObject *newmsg;
            newmsg = PyString_FromFormat(
                "Error when calling the metaclass bases\n"
                "    %s",
                PyString_AS_STRING(pvalue));
            if (newmsg != NULL) {
                Py_DECREF(pvalue);
                pvalue = newmsg;
            }
        }
        PyErr_Restore(ptype, pvalue, ptraceback);
    }
    return result;
}
ceval::build_class
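For readers who prefer Python to C, the lookup order in build_class can be paraphrased roughly as follows. This is an illustrative sketch, not CPython's code: the function name `pick_metaclass` and the `default` parameter (standing in for `types.ClassType`, which exists only on Python 2) are assumptions made for the example. It also leaves out what `type.__new__` does afterwards when bases have different metaclasses.

```python
def pick_metaclass(namespace, bases, module_globals, default):
    """Paraphrase of the Python 2 lookup order for a class's metaclass."""
    if "__metaclass__" in namespace:        # 1. the class body itself
        return namespace["__metaclass__"]
    if bases:                               # 2. the FIRST base class only
        base = bases[0]
        return getattr(base, "__class__", type(base))
    if "__metaclass__" in module_globals:   # 3. a module-level global
        return module_globals["__metaclass__"]
    return default                          # 4. old-style class


class Meta(type):
    pass


Base = Meta("Base", (object,), {})
OLD_STYLE = "placeholder for types.ClassType"

from_body = pick_metaclass({"__metaclass__": Meta}, (object,), {}, OLD_STYLE)
from_base = pick_metaclass({}, (Base,), {}, OLD_STYLE)
from_global = pick_metaclass({}, (), {"__metaclass__": Meta}, OLD_STYLE)
fallback = pick_metaclass({}, (), {}, OLD_STYLE)
```

Note that the second case returns `Meta` even though the namespace never mentions it: that is the case a subclass of a class with a custom metaclass falls into.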

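The problem I ran into

  In our project we used a metaclass to implement mixin behavior: a class gets behavior that is defined in some other classes. Put simply, the functions of those other classes are injected into this class. We did not want to use inheritance, for two reasons: semantically it is not an "is a" relationship, and Python's MRO is not exactly pleasant to deal with either. This is what we did (simplified code):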

import inspect

class RunImp(object):
    def run(self):
        print 'just run'

class FlyImp(object):
    def fly(self):
        print 'just fly'

class MetaMixin(type):
    def __init__(cls, name, bases, dic):
        super(MetaMixin, cls).__init__(name, bases, dic)
        member_list = (RunImp, FlyImp)

        for imp_member in member_list:
            if not imp_member:
                continue

            # Python 2: a method looked up on a class is an unbound method;
            # im_func is the underlying plain function.
            for method_name, fun in inspect.getmembers(imp_member, inspect.ismethod):
                setattr(cls, method_name, fun.im_func)

class Bird(object):
    __metaclass__ = MetaMixin

print Bird.__dict__
print Bird.__base__

  Output:

{'fly': <function fly at 0x025220F0>, '__module__': '__main__', '__metaclass__': <class '__main__.MetaMixin'>, '__dict__': <attribute '__dict__' of 'Bird' objects>, 'run': <function run at 0x025220B0>, '__weakref__': <attribute '__weakref__' of 'Bird' objects>, '__doc__': None}
<type 'object'>

  As you can see, through the metaclass Bird now has the two methods run and fly, while the class's inheritance hierarchy is unaffected.

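Overriding a method injected by MetaMixin

  One day the requirements changed: we needed a special Bird that inherits from Bird and overrides the run method. The new code: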

class Bird(object):
    __metaclass__ = MetaMixin

class SpecialBird(Bird):
    def run(self):
        print 'SpecialBird run'

if __name__ == '__main__':
    b = SpecialBird()
    b.run()

  Output:

just run

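  What?! The override has no effect at all. This seemed to overturn what I knew: Bird has a run attribute, the subclass SpecialBird overrides it, so the subclass's method should be the one called.

  Why? The answer lies in the __metaclass__ lookup order described above. SpecialBird does not define __metaclass__ itself, so the metaclass of its base class Bird is used. The run method defined in SpecialBird is therefore overwritten by MetaMixin. This can be verified with dis: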

import dis

class SpecialBird(Bird):
    def run(self):
        print 'SpecialBird run'
    dis.dis(run)  # inside the class body: the function as written here
dis.dis(SpecialBird.run)  # after class creation: what the class ended up with

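  [Figure: dis output of the two calls]

  The two disassemblies differ: inside the class body, run is the method explicitly defined in SpecialBird, but once the class has been created, SpecialBird.run has been overwritten by the one MetaMixin injects.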
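The same effect can be reproduced without dis in a cut-down sketch. Two things differ from the original code and are choices made for this example only: the root class is built by calling the metaclass directly (which stands in for `__metaclass__ = MetaMixin`), and functions are collected with `vars()` plus `inspect.isfunction` instead of `inspect.ismethod` and `im_func`, which exist only on Python 2. With those substitutions the snippet runs on both Python 2 and 3.

```python
import inspect


class RunImp(object):
    def run(self):
        return "just run"


class MetaMixin(type):
    def __init__(cls, name, bases, dic):
        super(MetaMixin, cls).__init__(name, bases, dic)
        # Copy every plain function defined on RunImp onto the new class,
        # overwriting whatever the class body defined under the same name.
        for attr_name, attr in list(vars(RunImp).items()):
            if inspect.isfunction(attr):
                setattr(cls, attr_name, attr)


# Calling the metaclass directly stands in for `__metaclass__ = MetaMixin`.
Bird = MetaMixin("Bird", (object,), {})


class SpecialBird(Bird):  # no metaclass named here; Bird's is reused
    def run(self):
        return "SpecialBird run"


result = SpecialBird().run()
print(type(SpecialBird).__name__)
print(result)
```

The class body's run does end up in the namespace passed to the metaclass, but MetaMixin's __init__ runs afterwards and replaces it.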

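Preventing attributes from being overwritten by accident

  This exposes a problem: the current MetaMixin can silently overwrite attributes. For example, if RunImp and FlyImp both define a function foo, the foo method of the resulting Bird class comes from FlyImp, not RunImp. Likewise, even if Bird defines foo in its own body, it is overwritten by FlyImp.foo.

  That is clearly not what we want, and it is a typical Python trap: no error is reported, but the program runs incorrectly. What we should do is surface the error as early as possible. The implementation is simple: a small change to MetaMixin, namely the added assert line.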

class MetaMixin(type):
    def __init__(cls, name, bases, dic):
        super(MetaMixin, cls).__init__(name, bases, dic)
        member_list = (RunImp, FlyImp)

        for imp_member in member_list:
            if not imp_member:
                continue

            for method_name, fun in inspect.getmembers(imp_member, inspect.ismethod):
                assert not hasattr(cls, method_name), method_name
                setattr(cls, method_name, fun.im_func)

  With MetaMixin modified this way, running the code below now raises an error:

import inspect

class RunImp(object):
    def run(self):
        print 'just run'

class FlyImp(object):
    def fly(self):
        print 'just fly'

class MetaMixin(type):
    def __init__(cls, name, bases, dic):
        super(MetaMixin, cls).__init__(name, bases, dic)
        member_list = (RunImp, FlyImp)

        for imp_member in member_list:
            if not imp_member:
                continue

            for method_name, fun in inspect.getmembers(imp_member, inspect.ismethod):
                assert not hasattr(cls, method_name), method_name
                setattr(cls, method_name, fun.im_func)

class Bird(object):
    __metaclass__ = MetaMixin

class SpecialBird(Bird):
    pass
Complete code that raises the error

  Running it raises an exception:

Traceback (most recent call last):
assert not hasattr(cls, method_name), method_name
AssertionError: run

  Hmm. The code is only a few lines long and there is only one run method, so why does it complain about a duplicate? Let's add some logging to MetaMixin:

import inspect

class RunImp(object):
    def run(self):
        print 'just run'

class FlyImp(object):
    def fly(self):
        print 'just fly'

class MetaMixin(type):
    def __init__(cls, name, bases, dic):
        super(MetaMixin, cls).__init__(name, bases, dic)
        member_list = (RunImp, FlyImp)

        for imp_member in member_list:
            if not imp_member:
                continue

            for method_name, fun in inspect.getmembers(imp_member, inspect.ismethod):
                print('class %s get method %s from %s' % (name, method_name, imp_member))
                # assert not hasattr(cls, method_name), method_name
                setattr(cls, method_name, fun.im_func)

class Bird(object):
    __metaclass__ = MetaMixin

class SpecialBird(Bird):
    pass
Code with logging added and the assert disabled

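  Output:

class Bird get method run from <class '__main__.RunImp'>
class Bird get method fly from <class '__main__.FlyImp'>
class SpecialBird get method run from <class '__main__.RunImp'>
class SpecialBird get method fly from <class '__main__.FlyImp'>

  Now it is obvious. When Bird was created, run and fly were injected into Bird.__dict__. SpecialBird inherits from Bird, so before the metaclass customizes SpecialBird, SpecialBird already has the run and fly attributes through inheritance, and the check fails when the metaclass is applied again.

  In short, this is a well-hidden trap: if a base class defines __metaclass__, the metaclass is invoked again whenever a subclass is created. In theory that may be unnecessary, and it can even have side effects.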
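A small recording metaclass makes the repeated invocation visible. This is only a sketch: `RecordingMeta` and the `calls` list are names invented for the example, and the root class is again built by calling the metaclass directly so that the snippet runs on Python 2 and 3 alike.

```python
calls = []


class RecordingMeta(type):
    def __init__(cls, name, bases, dic):
        super(RecordingMeta, cls).__init__(name, bases, dic)
        # hasattr follows the MRO, so attributes injected into a base class
        # are already visible on a freshly created subclass; `dic` holds
        # only what the class body itself defined.
        calls.append((name, hasattr(cls, "run"), "run" in dic))
        if not hasattr(cls, "run"):
            cls.run = lambda self: "just run"


Bird = RecordingMeta("Bird", (object,), {})


class SpecialBird(Bird):
    pass


for entry in calls:
    print(entry)
```

Each entry is (class name, hasattr result, whether the class body defined run). For SpecialBird, hasattr is already true although its body defines nothing, which is exactly what trips the assert above.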

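Fixing the repeated use of the metaclass

  Since we know that __metaclass__ is looked up in the subclass's dict first, and the base class is only considered if it is not found there, we can simply declare a new __metaclass__ in the subclass (SpecialBird), like this: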

class DummyMetaIMixin(type):
    pass

class SpecialBird(Bird):
    __metaclass__ = DummyMetaIMixin

  Unfortunately, this raises an exception I had never seen before:

TypeError: Error when calling the metaclass bases
  metaclass conflict: the metaclass of a derived class must be a (non-strict) subclass of the metaclasses of all its bases

  The meaning is clear: the subclass's __metaclass__ must inherit from the base class's __metaclass__. So let's change it again:
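The rule in that error message can be checked in isolation. In this sketch the names `MetaA`, `MetaB` and `MetaASub` are made up, and the classes are created by calling the metaclasses directly, which goes through the same check in `type.__new__` and works on both Python 2 and 3.

```python
class MetaA(type):
    pass


class MetaB(type):      # unrelated to MetaA
    pass


class MetaASub(MetaA):  # a subclass of MetaA
    pass


Base = MetaA("Base", (object,), {})

try:
    MetaB("Child", (Base,), {})   # MetaB is not a subclass of type(Base)
    conflict = None
except TypeError as exc:
    conflict = str(exc)

Child = MetaASub("Child", (Base,), {})  # fine: MetaASub derives from MetaA

print(conflict)
print(type(Child).__name__)
```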

class DummyMetaIMixin(MetaMixin):
    def __init__(cls, name, bases, dic):
        # Deliberately skip MetaMixin.__init__ so nothing is injected again.
        type.__init__(cls, name, bases, dic)

class SpecialBird(Bird):
    __metaclass__ = DummyMetaIMixin

  This works! The complete code:

import inspect

class RunImp(object):
    def run(self):
        print 'just run'

class FlyImp(object):
    def fly(self):
        print 'just fly'

class MetaMixin(type):
    def __init__(cls, name, bases, dic):
        super(MetaMixin, cls).__init__(name, bases, dic)
        member_list = (RunImp, FlyImp)

        for imp_member in member_list:
            if not imp_member:
                continue

            for method_name, fun in inspect.getmembers(imp_member, inspect.ismethod):
                print('class %s get method %s from %s' % (name, method_name, imp_member))
                assert not hasattr(cls, method_name), method_name
                setattr(cls, method_name, fun.im_func)

class Bird(object):
    __metaclass__ = MetaMixin


class DummyMetaIMixin(MetaMixin):
    def __init__(cls, name, bases, dic):
        type.__init__(cls, name, bases, dic)

class SpecialBird(Bird):
    __metaclass__ = DummyMetaIMixin
Code that solves the problem of the metaclass being re-applied to the subclass

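metaclass: __new__ and __init__

  At this point, Pythonistas who have used metaclasses may be wondering, because many examples online override type's __new__ method in the metaclass rather than __init__. In fact, our MetaMixin can also be implemented by overriding __new__, and that brings a pleasant surprise!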

import inspect

class RunImp(object):
    def run(self):
        print 'just run'

class FlyImp(object):
    def fly(self):
        print 'just fly'

class MetaMixinEx(type):
    def __new__(cls, name, bases, dic):
        member_list = (RunImp, FlyImp)

        for imp_member in member_list:
            if not imp_member:
                continue

            for method_name, fun in inspect.getmembers(imp_member, inspect.ismethod):
                print('class %s get method %s from %s' % (name, method_name, imp_member))
                # dic holds only the names defined in this class body
                assert method_name not in dic, (imp_member, method_name)
                dic[method_name] = fun.im_func
        return type.__new__(cls, name, bases, dic)

class Bird(object):
    __metaclass__ = MetaMixinEx

class SpecialBird(Bird):
    pass

  Output:

class Bird get method run from <class '__main__.RunImp'>
class Bird get method fly from <class '__main__.FlyImp'>
class SpecialBird get method run from <class '__main__.RunImp'>
class SpecialBird get method fly from <class '__main__.FlyImp'>

  As the output shows, the metaclass ran again for the subclass, yet nothing failed, even though the code contains an assert! Why? Essentially because of the difference between the two magic methods __new__ and __init__.

  Most Python programmers have written an __init__ method, but few have written __new__, because it rarely needs to be overridden. The Python documentation says when it does:

__new__() is intended mainly to allow subclasses of immutable types (like int, str, or tuple) to customize instance creation. It is also commonly overridden in custom metaclasses in order to customize class creation.

  That is, when subclassing immutable types, or in a metaclass!
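For the first of those two cases, a minimal sketch might look like this. The class name `Positive` is invented for the example; the point is only that the value of an int has to be fixed in `__new__`, because by the time `__init__` runs the object already exists and cannot be changed.

```python
class Positive(int):
    """An int subclass that stores the absolute value of its argument."""

    def __new__(cls, value):
        # int is immutable, so the value must be decided at creation time.
        return super(Positive, cls).__new__(cls, abs(value))


n = Positive(-5)
print(n)
print(isinstance(n, Positive))
```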

  So what is the difference between __new__ and __init__?

__new__:
  Called to create a new instance of class cls
__init__:
  Called when the instance is created.

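  That is, __new__ controls how the instance is created, while __init__ is called after the instance has been created.

  Note that __init__ is called only when __new__ returns an instance of cls, and that __init__ receives the same arguments as __new__. Look at the following example: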

class OBJ(object):
    def __new__(cls, a):
        ins = object.__new__(cls)
        print "call OBJ new with parameter %s, created inst %s" % (a, ins)
        return ins  # remove this line and __init__ is no longer called

    def __init__(self, a):
        print "call OBJ init with parameter %s, inst %s" % (a, self)

if __name__ == '__main__':
    OBJ(123)

  Output:

call OBJ new with parameter 123, created inst <__main__.OBJ object at 0x024C2470>
call OBJ init with parameter 123, inst <__main__.OBJ object at 0x024C2470>

  As you can see, self in __init__ is exactly the ins created and returned by __new__. As the comment on the return statement says, if that line is removed (so that ins is not returned), __init__ is not called.
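The rule that __init__ only runs when __new__ returns an instance of cls can also be checked by returning something else entirely. This is a sketch with made-up names (`Skips`, `calls`); it records calls in a list instead of printing so that it runs on both Python 2 and 3.

```python
calls = []


class Skips(object):
    def __new__(cls, a):
        calls.append("new")
        return a  # deliberately NOT an instance of cls

    def __init__(self, a):
        calls.append("init")


value = Skips(123)
print(value)
print(calls)
```

Here `Skips(123)` simply evaluates to whatever `__new__` returned, and `__init__` never appears in the record.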

  A metaclass inherits from type, so its __new__ and __init__ work just like those of an ordinary class, except that a metaclass's __new__ returns a class. Let's look at a metaclass example:

class Meta(type):
    def __new__(cls, name, bases, dic):
        print 'here class is %s' % cls
        print 'class %s will be create with bases class %s and attrs %s' % (name, bases, dic.keys())
        dic['what'] = name
        return type.__new__(cls, name, bases, dic)

    def __init__(cls, name, bases, dic):
        print 'here class is %s' % cls
        print 'class %s will be inited with bases class %s and attrs %s' % (name, bases, dic.keys())
        print cls.what
        super(Meta, cls).__init__(name, bases, dic)

class OBJ(object):
    __metaclass__ = Meta
    attr = 1

print('-----------------------------------------------')
class SubObj(OBJ):
    pass

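  Output:

here class is <class '__main__.Meta'>
class OBJ will be create with bases class (<type 'object'>,) and attrs ['__module__', '__metaclass__', 'attr']
here class is <class '__main__.OBJ'>
class OBJ will be inited with bases class (<type 'object'>,) and attrs ['__module__', '__metaclass__', 'attr', 'what']
OBJ
-----------------------------------------------
here class is <class '__main__.Meta'>
class SubObj will be create with bases class (<class '__main__.OBJ'>,) and attrs ['__module__']
here class is <class '__main__.SubObj'>
class SubObj will be inited with bases class (<class '__main__.OBJ'>,) and attrs ['__module__', 'what']
SubObj

  Note the separator line.

  First, although the first parameter of both __new__ and __init__ is named cls, the two are completely different things: in __new__ it is the metaclass (Meta), while in __init__ it is the newly created class.

  Second, once __new__ has been called, the resulting class object (cls, e.g. OBJ) already has the dynamically added what attribute.

  When __new__ is called, dic contains only the attributes defined in the class's own scope. So when SubObj is created, dic has no user-defined attributes; attr lives in the dict of the base class OBJ. The output also shows that the dic modified in __new__ is passed on to __init__.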
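These observations explain why the `method_name not in dic` assert in MetaMixinEx does not fire for a subclass. A compact sketch that records them rather than printing them follows; the `seen` dictionary and its key names are invented for the example, and the root class is created by calling the metaclass directly so the code runs on Python 2 and 3.

```python
seen = {}


class Meta(type):
    def __new__(mcs, name, bases, dic):
        # Only names defined in the class body are in dic at this point.
        seen[name] = {"attr_in_new_dic": "attr" in dic}
        dic["what"] = name  # added before the class object exists
        return super(Meta, mcs).__new__(mcs, name, bases, dic)

    def __init__(cls, name, bases, dic):
        super(Meta, cls).__init__(name, bases, dic)
        seen[name]["what_in_init_dic"] = "what" in dic      # same dict as in __new__
        seen[name]["attr_via_lookup"] = hasattr(cls, "attr")  # follows the MRO


OBJ = Meta("OBJ", (object,), {"attr": 1})


class SubObj(OBJ):
    pass


for class_name in ("OBJ", "SubObj"):
    print("%s %s" % (class_name, sorted(seen[class_name].items())))
```

For SubObj, attr is absent from dic but still reachable through attribute lookup, which is the difference between checking `in dic` and checking `hasattr`.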

references

what-is-a-metaclass-in-python

深刻理解Python中的元類(metaclass)

customizing-class-creation

special-method-names
