Why cinder-volume is unresponsive when deleting volumes (by quqi99)
Copyright notice: This article may be reproduced freely, but please mark the original source and author information with a hyperlink and include this copyright notice (Author: Zhang Hua, Published: 2018-08-30)
Problem
cinder-volume is very slow when deleting volumes; while a delete is in progress, other operations such as creating a volume get no response and produce no logs.
Theory
The idea behind green threads is that they do not share data with each other: every green thread has its own private data objects, and they are non-blocking, so when one green thread's I/O is not ready it steps aside and another green thread takes over. This is how a single process can efficiently run a large number of non-blocking green threads.
So the green-thread philosophy is to not share data objects, but green threads can still share objects via tpool.Proxy. For example, the eventlet.pools.Pool mechanism can be used to build a pool of httplib2.Http instances that is shared to some extent between different green threads (see a blog post of mine from five years ago -
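As a rough illustration of that pooled-sharing idea, here is a minimal sketch (not taken from that old blog post) of sharing httplib2.Http instances between green threads with eventlet.pools.Pool; the HttpPool class name, the pool size and the URL are my own choices for the example:

import eventlet
eventlet.monkey_patch()

import httplib2
from eventlet import pools

# A pool that lazily creates up to 4 httplib2.Http instances; green threads
# borrow an instance, use it, then return it, so instances are reused across
# green threads instead of being private to each one.
class HttpPool(pools.Pool):
    def create(self):
        return httplib2.Http()

http_pool = HttpPool(max_size=4)

def fetch(url):
    http = http_pool.get()       # borrow an Http instance from the pool
    try:
        resp, content = http.request(url)
        print "%s -> %s" % (url, resp.status)
    finally:
        http_pool.put(http)      # always return it to the pool

pile = eventlet.GreenPile()
for i in range(10):
    pile.spawn(fetch, 'http://example.com/')
list(pile)                       # wait for all green threads to finish

The test script below shows the other sharing mechanism mentioned above, tpool.Proxy: two green threads call into an object whose methods are actually executed by eventlet's pool of native worker threads (this script is referred to again in the verification note later in this post):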
import thread
import time

orig = time

import eventlet
from eventlet import tpool

eventlet.monkey_patch()


class MyException(Exception):
    pass


class FOO(object):
    def foo(self, char, starting_ident):
        # This method is executed in one of tpool's native worker threads.
        id = thread.get_ident()
        print "native {} exec foo({})".format(char, id)
        try:
            raise MyException()
        finally:
            # REPLACE with pass to reproduce failure
            # return
            pass


def stuff(char):
    # Green thread: repeatedly calls into the tpool.Proxy-wrapped object.
    print "entering green thread"
    while True:
        print "green exec foo({})".format(char)
        f = tpool.Proxy(FOO())
        f.foo(char, thread.get_ident())
        print "green finished foo({})".format(char)
        time.sleep(1)


if __name__ == "__main__":
    g = eventlet.greenthread.spawn(stuff, 'A')
    g = eventlet.greenthread.spawn(stuff, 'B')
    print "done"
    while True:
        time.sleep(1)
A rados.Rados instance comes from the Ceph Python bindings (python-rados/python-rbd), and it in turn spawns native threads to connect to rados. If every green thread instantiates its own rados.Rados and thereby starts a native thread, those native threads synchronize data through the Python interpreter process, and they are not non-blocking: when one native thread runs a long task without yielding, the other green threads get no chance to run, and at that point no operation on any image is possible. That is why this patch (https://review.openstack.org/#/c/175555/) introduced tpool.Proxy.
However, this patch (https://review.openstack.org/#/c/197710/) reverted it, the reason being that having the spawned non-blocking native threads import Python modules can cause a deadlock ("According to Python documentation, code can lead to a deadlock if the spawned thread directly or indirectly attempts to import a module."). So _connect_to_rados went back to the old way, in which every green thread instantiates its own rados.Rados and thereby starts a native thread, but the earlier problem can be mitigated by configuring rados_connect_timeout (a green thread that cannot connect within the timeout yields so that other green threads can run).
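For illustration only, here is a simplified sketch of that "old way": each green thread opens its own rados.Rados connection, passing a connect timeout so a stuck connection attempt eventually gives way to other green threads. The pool name, conffile, client name and timeout value are assumptions for the example; this is not the actual cinder _connect_to_rados code:

import eventlet
eventlet.monkey_patch()

import rados

RADOS_CONNECT_TIMEOUT = 5  # seconds, plays the role of rados_connect_timeout


def connect_to_rados(pool='cinder-ceph'):
    # Each caller (green thread) builds its own client; rados.Rados itself
    # starts native threads under the hood to talk to the cluster.
    client = rados.Rados(rados_id='cinder',
                         conffile='/etc/ceph/ceph.conf')
    try:
        client.connect(timeout=RADOS_CONNECT_TIMEOUT)
        ioctx = client.open_ioctx(pool)
        return client, ioctx
    except rados.Error:
        client.shutdown()
        raise


def worker(name):
    client, ioctx = connect_to_rados()
    try:
        print "%s connected, pool stats: %s" % (name, ioctx.get_stats())
    finally:
        ioctx.close()
        client.shutdown()


for n in ('A', 'B'):
    eventlet.spawn(worker, n)
eventlet.sleep(10)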
However, only the _connect_to_rados path was changed back to the old way; other code paths still go through tpool, for example _get_usage_info:
    def RBDProxy(self):
        return tpool.Proxy(self.rbd.RBD())

    def _get_usage_info(self):
        total_provisioned = 0
        with RADOSClient(self) as client:
            for t in self.RBDProxy().list(client.ioctx):
                with RBDVolumeProxy(self, t, read_only=True) as v:
                    ...


class RBDVolumeProxy(object):
    def __init__(self, driver, name, pool=None, snapshot=None,
                 read_only=False, remote=None, timeout=None):
        client, ioctx = driver._connect_to_rados(pool, remote, timeout)
        if snapshot is not None:
            snapshot = utils.convert_str(snapshot)

        try:
            self.volume = driver.rbd.Image(ioctx,
                                           utils.convert_str(name),
                                           snapshot=snapshot,
                                           read_only=read_only)
            self.volume = tpool.Proxy(self.volume)
        except driver.rbd.Error:
            LOG.exception("error opening rbd image %s", name)
            driver._disconnect_from_rados(client, ioctx)
            raise
As the code above shows:
- RBDProxy wraps rbd.RBD in tpool.Proxy, so its methods are dispatched to eventlet's non-blocking native worker threads.
- RBDVolumeProxy likewise wraps the rbd.Image object in tpool.Proxy, so image operations are also dispatched to those native worker threads.
- _get_usage_info runs periodically. If a volume is deleted while it is running, RBDVolumeProxy may fail to find the image and report the following error: Image volume-1f3aa3d5-5639-4a68-be07-14f3214320c6 is not found. _get_usage_info /usr/lib/python2.7/dist-packages/cinder/volume/drivers/rbd.py
- Keep in mind that the RBDVolumeProxy call runs in a native thread; when the ImageNotFound exception escapes from it without an explicit return, we get the green-thread problem described above. That green thread is blocked; the other green threads can still yield normally, but they soon hit the same ImageNotFound exception, so more and more green threads end up blocked. (To verify this, add another green thread running a busy loop to the test code above; you will see that the dead green threads do not affect the busy-loop one. See the sketch after this list.)
- The best workaround is to set rbd_exclusive_cinder_pool=True so that the _get_usage_info above is not called at all.
- I filed a bug for this - https://bugs.launchpad.net/cinder/+bug/1789828
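A minimal sketch of that verification, reusing the shape of the earlier test script (the busy_loop function and the generic Exception are my own additions for illustration): the 'A' and 'B' green threads call into a tpool.Proxy-wrapped object that raises, while a third green thread just loops and keeps printing, unaffected by the others:

import thread
import time

import eventlet
from eventlet import tpool

eventlet.monkey_patch()


class FOO(object):
    def foo(self, char):
        # Runs in a native tpool worker thread and raises, mimicking an
        # ImageNotFound escaping from the tpool.Proxy-wrapped rbd.Image.
        print "native exec foo({}) in thread {}".format(char, thread.get_ident())
        raise Exception("image not found")


def stuff(char):
    # Same shape as the earlier test: the exception is left unhandled, so
    # this green thread dies after its first call into tpool.
    while True:
        print "green exec foo({})".format(char)
        tpool.Proxy(FOO()).foo(char)
        time.sleep(1)


def busy_loop():
    # The extra green thread suggested above: it never touches tpool and
    # keeps printing no matter what happens to the other green threads.
    while True:
        print "busy loop still alive"
        time.sleep(1)


if __name__ == "__main__":
    eventlet.greenthread.spawn(stuff, 'A')
    eventlet.greenthread.spawn(stuff, 'B')
    eventlet.greenthread.spawn(busy_loop)
    while True:
        time.sleep(1)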
The relevant logs are as follows:
2018-08-29 06:57:41.604 1586622 DEBUG cinder.volume.drivers.rbd [req-f885ea4a-3e98-4aad-bb3f-102240aebca1 10cbcda6a7854fa79cfc37dc1945cb6d 5d5d0f0ab738467f8ca813dd41432afa - a51502c6e125414fbba0cc95decd86c5 a51502c6e125414fbba0cc95decd86c5] deleting rbd volume volume-2d5951be-25bf-4313-a706-593664f6cd2e delete_volume /usr/lib/python2.7/dist-packages/cinder/volume/drivers/rbd.py:977
2018-08-29 06:57:41.610 1586622 ERROR cinder.volume.drivers.rbd [req-f885ea4a-3e98-4aad-bb3f-102240aebca1 10cbcda6a7854fa79cfc37dc1945cb6d 5d5d0f0ab738467f8ca813dd41432afa - a51502c6e125414fbba0cc95decd86c5 a51502c6e125414fbba0cc95decd86c5] error opening rbd image volume-005a04e7-a113-4ebb-bd77-1d9d3221d8f2: ImageNotFound: [errno 2] error opening image volume-005a04e7-a113-4ebb-bd77-1d9d3221d8f2 at snapshot None
2018-08-29 06:57:41.610 1586622 ERROR cinder.volume.drivers.rbd Traceback (most recent call last):
2018-08-29 06:57:41.610 1586622 ERROR cinder.volume.drivers.rbd File "/usr/lib/python2.7/dist-packages/cinder/volume/drivers/rbd.py", line 147, in __init__
2018-08-29 06:57:41.610 1586622 ERROR cinder.volume.drivers.rbd read_only=read_only)
2018-08-29 06:57:41.610 1586622 ERROR cinder.volume.drivers.rbd File "rbd.pyx", line 1392, in rbd.Image.__init__ (/build/ceph-B2ToPL/ceph-12.2.4/obj-x86_64-linux-gnu/src/pybind/rbd/pyrex/rbd.c:13545)
2018-08-29 06:57:41.610 1586622 ERROR cinder.volume.drivers.rbd ImageNotFound: [errno 2] error opening image volume-005a04e7-a113-4ebb-bd77-1d9d3221d8f2 at snapshot None
2018-08-29 06:57:41.610 1586622 ERROR cinder.volume.drivers.rbd
2018-08-29 06:57:41.612 1586622 DEBUG cinder.volume.drivers.rbd [req-f885ea4a-3e98-4aad-bb3f-102240aebca1 10cbcda6a7854fa79cfc37dc1945cb6d 5d5d0f0ab738467f8ca813dd41432afa - a51502c6e125414fbba0cc95decd86c5 a51502c6e125414fbba0cc95decd86c5] Image volume-005a04e7-a113-4ebb-bd77-1d9d3221d8f2 is not found. _get_usage_info /usr/lib/python2.7/dist-packages/cinder/volume/drivers/rbd.py:409
Test
1, Script to create 100 volumes
#!/bin/bash -eu
. ~/stsstack-bundles/novarcv3_project
openstack project list| grep " admin "| awk '{print $2}'| xargs -l openstack quota set --volumes 200
TOKEN="`openstack token issue| grep ' id '| awk '{print $4}'`"
c_ep="`curl -s -XGET -H "X-Auth-Token: $TOKEN" "$OS_AUTH_URL/auth/catalog"| jq --raw-output '.catalog[] | select(.name | contains("cinderv3")).endpoints[] | select(.interface | contains("admin")).url'`"
echo "Cinder endpoint is $c_ep"
for i in {0..100}; do
(payload="`cat << EOF | python | sed 's/"/\\"/g'
import json
vol = {"volume": { "size": 1, "name": "vol"}}
print json.dumps(vol)
EOF`"
curl -s -X POST -H "X-Auth-Token: $TOKEN" -H "Content-Type: application/json" -d "$payload" ${c_ep}/volumes) &
done
2, Delete these 100 volumes
openstack volume list| egrep -v "^\+-+|ID"| awk '{print $2}'| xargs openstack volume delete
3, On the ceph-mon node, watch the number of volumes
watch -n 1 'rbd -p cinder-ceph ls| wc -l'
4, On the cinder-volume node, watch the number of threads created under the cinder-volume process
watch -n 1 'ps -eLf| egrep "[c]inder-volume"| wc -l'