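An In-Depth Look at gevent's Execution Flow
gevent's execution flow was always a bit fuzzy to me. I recently got somewhere by reading the source, and it seemed a shame to keep that to myself, so here it is.
gevent is a high-performance networking library. It was originally built on libevent, switched to libev as of version 1.0, and has greenlet at its core. gevent and eventlet are close relatives; the main difference is that eventlet implements its own event loop, whereas gevent uses libev. Both are widely used: OpenStack uses eventlet for its underlying network communication, and goagent uses gevent.
To understand gevent you first have to understand how it schedules. gevent has the concept of a hub (labelled MainThread in the accompanying diagram), which schedules all the other greenlet instances (labelled Coroutine in the diagram).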
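You may wonder why this model was chosen and why every switch has to go through the hub. I can think of two reasons:
1. The hub is the heart of the event loop; each time control switches to the hub, it carries on iterating over events. If one greenlet never yields, no other greenlet gets to run.
2. Maintaining a relationship between two parties is simpler than maintaining many. All we ever care about is the hub and the current greenlet; we never have to think about how the greenlets relate to one another.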
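The hub-centred model can be illustrated with a small, self-contained sketch. It does not use gevent or greenlet at all: `ToyHub`, `worker`, and the use of generators in place of greenlets are assumptions made purely for illustration. Each task yields to hand control back to the hub, and the hub resumes tasks in the order in which their callbacks were queued.

```python
from collections import deque


class ToyHub:
    """Hypothetical stand-in for gevent's hub: a FIFO queue of callbacks."""

    def __init__(self):
        self.callbacks = deque()

    def run_callback(self, fn):
        # Queue a callback to be run by the loop.
        self.callbacks.append(fn)

    def spawn(self, gen_func, *args):
        # Create the task and schedule its first step.
        task = gen_func(*args)
        self.run_callback(lambda: self._step(task))

    def _step(self, task):
        try:
            next(task)  # run the task until it yields back to the hub
        except StopIteration:
            return      # task finished, nothing to reschedule
        # The task yielded: queue it again behind everything already waiting.
        self.run_callback(lambda: self._step(task))

    def run(self):
        # The "event loop": keep running callbacks until none are left.
        while self.callbacks:
            self.callbacks.popleft()()


log = []


def worker(name, steps):
    for i in range(steps):
        log.append((name, i))
        yield  # plays the role of sleep(0): give control back to the hub


hub = ToyHub()
hub.spawn(worker, "a", 2)
hub.spawn(worker, "b", 2)
hub.run()
print(log)  # [('a', 0), ('b', 0), ('a', 1), ('b', 1)]
```

The tasks never refer to each other. Each one only ever hands control to the hub, which is the second reason given above.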
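Let's see what happens in the simplest case, gevent.sleep.
How would the simplest call, sleep(0), have to be scheduled? From the above, it clearly has to:
1. Register the current greenlet's switch function with the event loop
2. Switch to the hub, which runs the main event loop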

    def sleep(seconds=0, ref=True):
        hub = get_hub()
        loop = hub.loop
        if seconds <= 0:
            # sleep(0): queue a callback that resumes this greenlet,
            # then switch to the hub
            waiter = Waiter()
            loop.run_callback(waiter.switch)
            waiter.get()
        else:
            # otherwise wait in the hub on a timer
            hub.wait(loop.timer(seconds, ref=ref))

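Once control is in the hub, the hub invokes the callback that was just registered (waiter.switch), which returns to the greenlet that called sleep.
If you are not familiar with Waiter this may still be a little vague, so let's take a proper look at what Waiter is.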

    >>> result = Waiter()
    >>> timer = get_hub().loop.timer(0.1)
    >>> timer.start(result.switch, 'hello from Waiter')
    >>> result.get() # blocks for 0.1 seconds
    'hello from Waiter'

Here is its get method:

    def get(self):
        assert self.greenlet is None, 'This Waiter is already used by %r' % (self.greenlet, )
        self.greenlet = getcurrent()
        try:
            return self.hub.switch()
        finally:
            self.greenlet = None

get sets self.greenlet to the current greenlet and then switches to the main loop via self.hub.switch(). The main loop will then call back result.switch; here is the code:

    def switch(self, value=None):
        """Switch to the greenlet if one's available. Otherwise store the value."""
        greenlet = self.greenlet
        assert getcurrent() is self.hub, "Can only use Waiter.switch method from the Hub greenlet"
        switch = greenlet.switch
        try:
            switch(value)
        except:
            self.hub.handle_error(switch, *sys.exc_info())

With the analysis above, gevent's execution flow should now be clear.
One question remains: what if result.switch happens first, as in the following?

    >>> result = Waiter()
    >>> timer = get_hub().loop.timer(0.1)
    >>> timer.start(result.switch, 'hi from Waiter')
    >>> sleep(0.2)
    >>> result.get() # returns immediately without blocking
    'hi from Waiter'

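The docstring of switch quoted earlier hints at the answer: if no greenlet is waiting yet, the value is stored and handed over when get is called later. The sketch below models only that idea and is not gevent's actual implementation. `ToyWaiter`, the `_NONE` sentinel, and the `consumer` callback (which stands in for the blocked greenlet) are hypothetical names introduced for illustration.

```python
_NONE = object()  # sentinel meaning "no value has been delivered yet"


class ToyWaiter:
    """Hypothetical, callback-based model of a one-shot hand-off."""

    def __init__(self):
        self.value = _NONE
        self.consumer = None  # stands in for the greenlet blocked in get()

    def switch(self, value=None):
        if self.consumer is None:
            # Nobody is waiting yet: remember the value for a later get().
            self.value = value
        else:
            # Somebody is waiting: deliver the value right away.
            consumer, self.consumer = self.consumer, None
            consumer(value)

    def get(self, consumer):
        if self.value is not _NONE:
            # switch() already happened: hand over the stored value at once.
            consumer(self.value)
        else:
            # Otherwise "block": park the consumer until switch() is called.
            self.consumer = consumer


received = []

early = ToyWaiter()
early.switch('hi from Waiter')   # switch happens first ...
early.get(received.append)       # ... so get completes immediately

late = ToyWaiter()
late.get(received.append)        # get first: nothing is delivered yet
late.switch('hello from Waiter') # the value arrives when switch is called

print(received)  # ['hi from Waiter', 'hello from Waiter']
```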
Now that we know how gevent executes, let's look at what gevent.spawn and join actually do.
gevent.spawn is really just Greenlet.spawn: it creates a greenlet and adds that greenlet's switch() to the callbacks of the hub's main loop.

    class Greenlet(greenlet):
        """A light-weight cooperatively-scheduled execution unit."""

        def __init__(self, run=None, *args, **kwargs):
            hub = get_hub()
            greenlet.__init__(self, parent=hub)
            if run is not None:
                self._run = run
            self._start_event = None

        def start(self):
            """Schedule the greenlet to run in this loop iteration"""
            if self._start_event is None:
                self._start_event = self.parent.loop.run_callback(self.switch)

        @classmethod
        def spawn(cls, *args, **kwargs):
            """Return a new :class:`Greenlet` object, scheduled to start.
            The arguments are passed to :meth:`Greenlet.__init__`.
            """
            g = cls(*args, **kwargs)
            g.start()
            return g

The following code demonstrates this:

    import gevent

    def talk(msg):
        print(msg)

    g1 = gevent.spawn(talk, 'bar')
    gevent.sleep(0)

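This prints bar: sleep switches us to the hub, the hub runs the callback we added (which starts talk), and all is well.
Don't congratulate yourself just yet; wait until the following code also seems perfectly normal to you.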

    import gevent

    def talk(msg):
        print(msg)
        gevent.sleep(0)
        print(msg)

    g1 = gevent.spawn(talk, 'bar')
    gevent.sleep(0)

This time the output is still a single bar. Something seems off: shouldn't it print bar twice? Why does this happen?
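Let's walk through the flow carefully:
1. gevent.spawn registers the callback that starts talk (g1.switch)
2. The final line, gevent.sleep(0), registers the switch of the current (outermost) greenlet with the hub and then switches to the hub
3. The hub runs the talk callback, which prints "bar"; the gevent.sleep inside talk then registers g1.switch with the hub again and switches back to the hub
4. Because the outermost greenlet was registered first, back in step 2, the hub now switches to it, and at that point the program simply ends. The outermost greenlet is not a child greenlet of the hub, so when it dies control does not return to a parent greenlet, i.e. the hub, and the second print in talk never runs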
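The four steps can be replayed with a toy model that needs neither gevent nor greenlet. Generators stand in for greenlets, a deque stands in for the hub's callback queue, and `pending`, `output`, and `replay` are hypothetical names. It is only a sketch of the ordering argument, under the assumption that callbacks run in the order they were registered.

```python
from collections import deque

pending = deque()  # stand-in for the hub's callback queue
output = []        # what the program would print


def talk(msg):
    output.append(msg)
    yield               # stand-in for gevent.sleep(0) inside talk
    output.append(msg)  # never reached in this scenario


def main():
    pending.append(("g1", talk("bar")))  # step 1: spawn queues g1
    yield                                # step 2: the final sleep(0)
    # returning here models the outermost greenlet finishing


def replay():
    main_task = main()
    next(main_task)                      # run main up to its sleep(0)
    pending.append(("main", main_task))  # sleep(0) queues main's switch
    while pending:
        name, task = pending.popleft()
        try:
            next(task)                   # the hub switches into the task
        except StopIteration:
            if name == "main":
                # main has finished: the program ends, whatever is queued
                return len(pending)
            continue
        pending.append((name, task))     # the task called sleep(0) again
    return 0


still_pending = replay()
print(output)         # ['bar']
print(still_pending)  # 1: g1's second half was queued but never ran
```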
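You might say: can't I just switch to the hub manually? Both lines do get printed that way, but nothing is registered to switch back to the outermost greenlet, so the main loop has no way to finish normally and ends with LoopExit: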

    import gevent

    def talk(msg):
        print(msg)
        gevent.sleep(0)
        print(msg)

    g1 = gevent.spawn(talk, 'bar')
    gevent.get_hub().switch()

Output:

    bar
    bar
    Traceback (most recent call last):
      File "F:\py_cgi\geve.py", line 9, in <module>
        gevent.get_hub().switch()
      File "C:\Python26\lib\site-packages\gevent\hub.py", line 331, in switch
        return greenlet.switch(self)
    gevent.hub.LoopExit: This operation would block forever

This is exactly why join exists. Let's see how join manages it.

    def join(self, timeout=None):
        """Wait until the greenlet finishes or *timeout* expires.
        Return ``None`` regardless.
        """
        if self.ready():
            return
        else:
            switch = getcurrent().switch
            self.rawlink(switch)
            try:
                t = Timeout.start_new(timeout)
                try:
                    result = self.parent.switch()
                    assert result is self, 'Invalid switch into Greenlet.join(): %r' % (result, )
                finally:
                    t.cancel()