Python memory leaks

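I. Python has automatic garbage collection (the interpreter frees an object's memory as soon as its reference count drops to zero). When memory does leak, the usual causes are a leak inside an extension library, a reference cycle, or objects that are never removed from a global container.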

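The first needs no discussion; the second is illustrated below (the memory of Obj('B') and Obj('c') is never reclaimed). (Note, 2013.10.20: it seems the interpreter's cyclic garbage collector, a mark-and-sweep style mechanism, does eventually reclaim memory held by reference cycles, so in general there is no need to go out of the way to avoid cycles while coding. The exception, in the Python 2 used here, is a cycle containing objects that define __del__(): the collector leaves those alone, which is exactly the case in this example. I still need to read up on the details: http://stackoverflow.com/questions/4484167/details-how-python-garbage-collection-works. I also still have not finished the garbage collection chapter of the Python source-code analysis book, which keeps nagging at me.)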

[dongsong@localhost python_study]$ cat leak_test2.py 
#encoding=utf-8

class Obj:
    def __init__(self,name='A'):
        self.name = name
        print '%s inited' % self.name
    def __del__(self):
        print '%s deleted' % self.name

if __name__ == '__main__':
    a = Obj('A')
    b = Obj('B')
    c = Obj('c')

    c.attrObj = b
    b.attrObj = c
[dongsong@localhost python_study]$ vpython leak_test2.py 
A inited
B inited
c inited
A deleted
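The behaviour above can also be checked without __del__() at all. The following is a minimal sketch, written for Python 3 rather than the Python 2 used in the listing, and the helper names cycle_survives_until_gc and weak_backref_needs_no_gc are made up for illustration. It uses a weak reference as a probe to show that a cycle outlives its last name until the cyclic collector runs, and that a weak back-reference avoids the cycle altogether.

```python
import gc
import weakref


class Obj:
    def __init__(self, name):
        self.name = name
        self.attrObj = None


def cycle_survives_until_gc():
    """Return (alive_before_collect, alive_after_collect) for a two-object cycle."""
    gc.disable()  # keep the automatic collector out of the experiment
    try:
        b, c = Obj('B'), Obj('c')
        b.attrObj, c.attrObj = c, b      # b and c now refer to each other
        probe = weakref.ref(b)           # a weak reference does not keep b alive
        del b, c                         # reference counts stay above zero
        before = probe() is not None     # still alive: refcounting cannot free it
        gc.collect()                     # the cyclic collector finds the cycle
        after = probe() is not None
        return before, after
    finally:
        gc.enable()                      # assumes collection was enabled beforehand


def weak_backref_needs_no_gc():
    """Return True if the object is freed by reference counting alone."""
    gc.disable()
    try:
        b, c = Obj('B'), Obj('c')
        b.attrObj = c
        c.attrObj = weakref.proxy(b)     # weak back-reference, so no cycle forms
        probe = weakref.ref(b)
        del b, c
        return probe() is None
    finally:
        gc.enable()
```

On CPython, the first helper should report that the cycle is alive before gc.collect() and gone afterwards, and the second should report that no collection was needed.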

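II. The objgraph module

This module can find the object types that are growing fastest and those with the most instances. It can draw a graph of everything an object refers to, or of everything that refers to an object, and it can look up an object by its address.

Still, hunting a memory leak with it feels like looking for a needle in a haystack: you have to pick out the suspects yourself from the logs of fastest-growing and most numerous objects. These are usually common types such as list, dict, and tuple, which are hard to trace; when the top entries are your own custom types, the cause is much easier to pin down.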
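To make the idea of "fastest-growing types" concrete, here is a rough stdlib-only sketch of that kind of report. It is not objgraph's implementation, only an approximation built on gc.get_objects(). The names type_counts, growth, Leaky, _cache, and leak are hypothetical and exist purely for the demonstration (Python 3).

```python
import gc
from collections import Counter


def type_counts():
    """Count the objects currently tracked by the collector, keyed by type name."""
    return Counter(type(o).__name__ for o in gc.get_objects())


def growth(before, after, limit=3):
    """Return up to `limit` tuples (type_name, count, delta), largest increase first."""
    deltas = [(name, after[name], after[name] - before[name])
              for name in after if after[name] > before[name]]
    deltas.sort(key=lambda item: item[2], reverse=True)
    return deltas[:limit]


class Leaky:
    """Stand-in for an application object that piles up."""


_cache = []  # hypothetical global container that nobody ever clears


def leak(n):
    """Simulate a leak by parking n new objects in the global container."""
    _cache.extend(Leaky() for _ in range(n))
```

A typical use is to take one snapshot, exercise the suspect code path, take a second snapshot, and pass both to growth(). A custom type such as Leaky stands out immediately, whereas growth in list or dict still leaves the question of who owns them.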

1. show_refs(), show_backrefs(), show_most_common_types(), show_growth()

[dongsong@localhost python_study]$ !cat
cat objgraph1.py 
#encoding=utf-8
import objgraph

if __name__ == '__main__':
        x = []
        y = [x, [x], dict(x=x)]
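        objgraph.show_refs([y], filename='/tmp/sample-graph.png') # draw everything reachable from the objects in [y]
        objgraph.show_backrefs([x], filename='/tmp/sample-backref-graph.png') # draw everything that refers to x
        #objgraph.show_most_common_types() # counts for the most common types; too much data to be of much use here
        objgraph.show_growth(limit=4) # print the types whose counts grew since program start or since the last show_growth() call, largest increase first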
[dongsong@localhost python_study]$ !vpython
vpython objgraph1.py 
Graph written to /tmp/tmpuSFr9A.dot (5 nodes)
Image generated as /tmp/sample-graph.png
Graph written to /tmp/tmpAn6niV.dot (7 nodes)
Image generated as /tmp/sample-backref-graph.png
tuple                          3393     +3393
wrapper_descriptor              945      +945
function                        830      +830
builtin_function_or_method      622      +622

sample-graph.png

sample-backref-graph.png

2. show_chain()

[dongsong@localhost python_study]$ cat objgraph2.py 
#encoding=utf-8
import objgraph, inspect, random

class MyBigFatObject(object):
        pass

def computate_something(_cache = {}):
        _cache[42] = dict(foo=MyBigFatObject(),bar=MyBigFatObject())
        x = MyBigFatObject()

if __name__ == '__main__':
        objgraph.show_growth(limit=3)
        computate_something()
        objgraph.show_growth(limit=3)
        objgraph.show_chain(
                objgraph.find_backref_chain(random.choice(objgraph.by_type('MyBigFatObject')),
                        inspect.ismodule),
                filename = '/tmp/chain.png')
        #roots = objgraph.get_leaking_objects()
        #print 'len(roots)=%d' % len(roots)
        #objgraph.show_most_common_types(objects = roots)
        #objgraph.show_refs(roots[:3], refcounts=True, filename='/tmp/roots.png')
[dongsong@localhost python_study]$ !vpython
vpython objgraph2.py 
tuple                  3400     +3400
wrapper_descriptor      945      +945
function                831      +831
wrapper_descriptor      956       +11
tuple                  3406        +6
member_descriptor       165        +4
Graph written to /tmp/tmpklkHqC.dot (7 nodes)
Image generated as /tmp/chain.png

chain.png


III. The gc module

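This module can identify the objects that the garbage collector finds unreachable and those it cannot free (uncollectable), which is something objgraph does not offer.

gc.collect() forces a collection and returns the number of unreachable objects it found.

gc.garbage holds the list of unreachable objects that are also uncollectable (in Python 2, objects that define a __del__() finalizer and are caught in a reference cycle). If DEBUG_SAVEALL is set, then all unreachable objects will be added to this list rather than freed. Since Python 3.4 such cycles are collected as well, so on newer interpreters gc.garbage normally stays empty unless DEBUG_SAVEALL is set.

Warning: if you turn off automatic collection with gc.disable() and then never call gc.collect() yourself, any memory held by reference cycles is never reclaimed and you will watch memory usage climb steadily.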
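The inspection technique described above still works on Python 3 if DEBUG_SAVEALL is used to keep the unreachable objects around. The sketch below is only an illustration under that assumption; Node and find_unreachable are hypothetical names, and the exact count returned by gc.collect() varies between interpreter versions, so only the saved objects are examined.

```python
import gc


class Node:
    def __init__(self, name):
        self.name = name
        self.peer = None


def find_unreachable():
    """Build a two-node cycle, collect it, and report what the collector saved."""
    gc.collect()                      # flush garbage left over from earlier code
    gc.set_debug(gc.DEBUG_SAVEALL)    # keep unreachable objects in gc.garbage
    try:
        a, b = Node('a'), Node('b')
        a.peer, b.peer = b, a         # reference cycle
        del a, b                      # the cycle is now unreachable
        found = gc.collect()          # number of unreachable objects found
        names = sorted(o.name for o in gc.garbage if isinstance(o, Node))
        return found, names
    finally:
        gc.set_debug(0)               # restore normal behaviour
        del gc.garbage[:]             # drop the saved references
```

Calling find_unreachable() returns the collector's count together with the names of the saved Node instances, which should be 'a' and 'b'. Clearing gc.garbage afterwards matters, because the list itself keeps everything in it alive.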

[dongsong@bogon python_study]$ cat gc_test.py 
#encoding=utf-8

import gc

class MyObj:
        def __init__(self, name):
                self.name = name
                print "%s inited" % self.name
        def __del__(self):
                print "%s deleted" % self.name


if __name__ == '__main__':
        gc.disable()
        gc.set_debug(gc.DEBUG_COLLECTABLE | gc.DEBUG_UNCOLLECTABLE | gc.DEBUG_INSTANCES | gc.DEBUG_OBJECTS | gc.DEBUG_SAVEALL)

        a = MyObj('a')
        b = MyObj('b')
        c = MyObj('c')
        a.attr = b
        b.attr = a
        a = None
        b = None
        c = None

        if gc.isenabled():
                print 'automatic collection is enabled'
        else:
                print 'automatic collection is disabled'

        rt = gc.collect()
        print "%d unreachable" % rt

        garbages = gc.garbage
        print "\n%d garbages:" % len(garbages)
        for garbage in garbages:
                if isinstance(garbage, MyObj):
                        print "obj-->%s name-->%s attrrMyObj-->%s" % (garbage, garbage.name, garbage.attr)
                else:
                        print str(garbage)


[dongsong@bogon python_study]$ vpython gc_test.py 
a inited
b inited
c inited
c deleted
automatic collection is disabled
gc: uncollectable <MyObj instance at 0x7f3ebd455b48>
gc: uncollectable <MyObj instance at 0x7f3ebd455b90>
gc: uncollectable <dict 0x261c4b0>
gc: uncollectable <dict 0x261bdf0>
4 unreachable

4 garbages:
obj--><__main__.MyObj instance at 0x7f3ebd455b48> name-->a attrrMyObj--><__main__.MyObj instance at 0x7f3ebd455b90>
obj--><__main__.MyObj instance at 0x7f3ebd455b90> name-->b attrrMyObj--><__main__.MyObj instance at 0x7f3ebd455b48>
{'name': 'a', 'attr': <__main__.MyObj instance at 0x7f3ebd455b90>}
{'name': 'b', 'attr': <__main__.MyObj instance at 0x7f3ebd455b48>}

IV. The pdb module

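The commands are much like gdb's (except that you do not have to type p to print a value, and the debugger prompt behaves much like Python's interactive mode).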

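h(elp)  show help

c(ontinue)  continue execution

n(ext)  run the next statement

s(tep)  run the next step (stepping into function calls)

b(reak)  set a breakpoint

l(ist)  show source code

bt  show the call stack (alias of w(here))

Enter  repeat the last command

....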

I like to drop pdb.set_trace() into the spot I want to debug and go from there... (there are plenty of other ways to start the debugger as well).
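As a small illustration of that habit, the sketch below guards the pdb.set_trace() call behind a flag so the function still runs unattended. The function average and its debug parameter are invented for this example (Python 3).

```python
import pdb


def average(values, debug=False):
    """Return the arithmetic mean; pause in pdb first when debug is True."""
    if debug:
        pdb.set_trace()  # execution stops here; try `p values`, `n`, then `c`
    return sum(values) / len(values)
```

Calling average([1, 2, 3], debug=True) from a terminal opens the (Pdb) prompt at the line after the call. On Python 3.7 and later, the built-in breakpoint() does the same job by default.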

V. Django memory leaks

Django isn't known to leak memory. If you find your Django processes are allocating more and more memory, with no sign of releasing it, check to make sure your DEBUG setting is set to False. If DEBUG is True, then Django saves a copy of every SQL statement it has executed.

To fix the problem, set DEBUG to False.

If you need to clear the query list manually at any point in your functions, just call reset_queries(), like this:

from django import db
db.reset_queries()