Flask Source Code Analysis: The Workflow
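I have been using Flask to write small backend services for a while. With some time to spare, I dug into Flask's source code, and I could not help admiring how masterfully it uses Python. Let's start the analysis.
Flask is built on the Werkzeug library, and Werkzeug is built on WSGI. What is WSGI? WSGI (Web Server Gateway Interface) is a specification that defines the interface between a web server and a Python web application: the server hands every request it receives from clients to an application object, a callable, for processing.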
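To make the "application object" concrete, here is a minimal sketch of a WSGI application and of the server's side of the call. The names hello_app and call_wsgi_app are made up for illustration, and the environ is reduced to two keys; a real server fills in many more.

```python
def hello_app(environ, start_response):
    """A WSGI application is any callable taking (environ, start_response)."""
    body = ("Hello from " + environ.get("PATH_INFO", "/")).encode("utf-8")
    headers = [("Content-Type", "text/plain; charset=utf-8"),
               ("Content-Length", str(len(body)))]
    start_response("200 OK", headers)
    return [body]  # an iterable of bytes


def call_wsgi_app(app, path):
    """Play the server's role: build an environ, call the app, collect output."""
    captured = {}

    def start_response(status, response_headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = response_headers

    # Assumption: a deliberately tiny environ, just enough for hello_app.
    environ = {"REQUEST_METHOD": "GET", "PATH_INFO": path}
    body = b"".join(app(environ, start_response))
    return captured["status"], body


status, body = call_wsgi_app(hello_app, "/index")
```

Flask plays the role of hello_app here, and Werkzeug's development server plays the role of call_wsgi_app.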
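With that background in mind, this article does not stop at Flask's own source. It also lists the relevant Werkzeug source, along with the Python standard library code that Werkzeug relies on, in an effort to lay out the whole logic clearly.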
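Server classes
Flask:
The application class, which is what users work with directly. Every Flask program must create an instance of it. In the simplest case, app = Flask(__name__); Flask uses this argument to determine the application's root directory.
* Class attributes: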
Request
Response
* Methods:
run
The run method calls the run_simple function from the Werkzeug library to start a BaseWSGIServer.
def run_simple(hostname, port, application, use_reloader=False,
               use_debugger=False, use_evalex=True,
               extra_files=None, reloader_interval=1,
               reloader_type='auto', threaded=False,
               processes=1, request_handler=None, static_files=None,
               passthrough_errors=False, ssl_context=None):
    if use_debugger:
        from werkzeug.debug import DebuggedApplication
        application = DebuggedApplication(application, use_evalex)
    if static_files:
        from werkzeug.wsgi import SharedDataMiddleware
        application = SharedDataMiddleware(application, static_files)
    def inner():
        try:
            fd = int(os.environ['WERKZEUG_SERVER_FD'])
        except (LookupError, ValueError):
            fd = None
        make_server(hostname, port, application, threaded,
                    processes, request_handler,
                    passthrough_errors, ssl_context,
                    fd=fd).serve_forever()
    if use_reloader:
        # If we're not running already in the subprocess that is the
        # reloader we want to open up a socket early to make sure the
        # port is actually available.
        if os.environ.get('WERKZEUG_RUN_MAIN') != 'true':
            if port == 0 and not can_open_by_fd:
                raise ValueError('Cannot bind to a random port with enabled '
                                 'reloader if the Python interpreter does '
                                 'not support socket opening by fd.')
            # Create and destroy a socket so that any exceptions are
            # raised before we spawn a separate Python interpreter and
            # lose this ability.
            address_family = select_ip_version(hostname, port)
            s = socket.socket(address_family, socket.SOCK_STREAM)
            s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
            s.bind((hostname, port))
            if hasattr(s, 'set_inheritable'):
                s.set_inheritable(True)
            # If we can open the socket by file descriptor, then we can just
            # reuse this one and our socket will survive the restarts.
            if can_open_by_fd:
                os.environ['WERKZEUG_SERVER_FD'] = str(s.fileno())
                s.listen(LISTEN_QUEUE)
            else:
                s.close()
        from ._reloader import run_with_reloader
        run_with_reloader(inner, extra_files, reloader_interval,
                          reloader_type)
    else:
        inner()
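make_server creates the HTTP server, and serve_forever starts the service and listens on the port. The serving loop uses select to detect incoming connections.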
BaseWSGIServer:
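A WSGI server that inherits from HTTPServer.
* Attributes:
app: the Flask instance, passed in at initialization.
* Methods:
__init__: initializes the underlying HTTPServer and passes in the request handler class, WSGIRequestHandler.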
HTTPServer
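The HTTP server from the Python standard library (BaseHTTPServer.py in Python 2; the module is named http.server in Python 3). It implements a server_bind method that binds the server address and port, which we can ignore here. It inherits from SocketServer.TCPServer, so HTTP is in fact served on top of TCP.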
SocketServer.TCPServer
Inherits from BaseServer. It implements a fileno method that returns the file descriptor of the listening socket. This exists for the benefit of the select module.
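Why fileno matters can be shown with a small sketch: select accepts any object that exposes a fileno() method and hands that same object back when the descriptor becomes readable. The Listener class below is hypothetical, and a socketpair stands in for a real listening socket so that no network access is needed.

```python
import select
import socket


class Listener:
    """Hypothetical wrapper: select() only requires a fileno() method."""

    def __init__(self, sock):
        self.sock = sock

    def fileno(self):
        return self.sock.fileno()


# socketpair() returns two connected sockets without touching the network.
left, right = socket.socketpair()
watched = Listener(left)

# Nothing has been written yet, so nothing is readable (timeout 0 = poll).
ready_before, _, _ = select.select([watched], [], [], 0)

right.sendall(b"ping")
# Now the wrapped socket has data; select returns the wrapper object itself.
ready_after, _, _ = select.select([watched], [], [], 1.0)
received = left.recv(4)

left.close()
right.close()
```

The server object is passed to select in the same way, which is why TCPServer needs to provide fileno.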
Request handler classes
BaseRequestHandler
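The most basic request handler class. Its attributes include server (the HTTPServer) and request. During initialization it calls the setup, handle, and finish methods in that order.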
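The call order can be checked directly. The sketch below uses the Python 3 module name socketserver (SocketServer in Python 2) and a made-up RecordingHandler subclass. Since the base __init__ only stores its arguments before calling the three methods, placeholder values are enough.

```python
import socketserver


class RecordingHandler(socketserver.BaseRequestHandler):
    """Hypothetical handler that only records which methods ran."""

    calls = []

    def setup(self):
        self.calls.append("setup")

    def handle(self):
        self.calls.append("handle")

    def finish(self):
        self.calls.append("finish")


# Constructing the handler is what triggers setup -> handle -> finish.
handler = RecordingHandler(request=None,
                           client_address=("127.0.0.1", 0),
                           server=None)
```

Merely instantiating the class processes the request, which is why the server only has to construct the handler.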
SocketServer.StreamRequestHandler
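Inherits from BaseRequestHandler and overrides the setup and finish methods. setup creates the two file objects for reading and writing (rfile and wfile), and finish flushes and closes them. The handle method is the key part, and it is left to subclasses.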
BaseHTTPRequestHandler
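Inherits from SocketServer.StreamRequestHandler. This class is nearly enough on its own to handle simple HTTP requests, but Flask only uses a few of its methods. The most important one is handle.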
def handle(self):
    """Handle multiple requests if necessary."""
    self.close_connection = 1
    self.handle_one_request()
    while not self.close_connection:
        self.handle_one_request()
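As you can see, it mainly calls handle_one_request, and it can handle multiple requests over the same connection. We can ignore this class's own handle_one_request, because WSGIRequestHandler overrides it. Let's focus on WSGIRequestHandler.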
WSGIRequestHandler
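Inherits from BaseHTTPRequestHandler. Flask does not define its own request handler class; it uses the WSGIRequestHandler from the Werkzeug library.
The make_environ method is an important one: it builds the WSGI environ for the request. We will set it aside for now.
The handle_one_request method: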
def handle_one_request(self):
    """Handle a single HTTP request."""
    self.raw_requestline = self.rfile.readline()
    if not self.raw_requestline:
        self.close_connection = 1
    elif self.parse_request():
        return self.run_wsgi()
It calls the run_wsgi method to do the work:
def run_wsgi(self):
    if self.headers.get('Expect', '').lower().strip() == '100-continue':
        self.wfile.write(b'HTTP/1.1 100 Continue\r\n\r\n')
    self.environ = environ = self.make_environ()
    headers_set = []
    headers_sent = []
    def write(data):
        assert headers_set, 'write() before start_response'
        if not headers_sent:
            status, response_headers = headers_sent[:] = headers_set
            try:
                code, msg = status.split(None, 1)
            except ValueError:
                code, msg = status, ""
            self.send_response(int(code), msg)
            header_keys = set()
            for key, value in response_headers:
                self.send_header(key, value)
                key = key.lower()
                header_keys.add(key)
            if 'content-length' not in header_keys:
                self.close_connection = True
                self.send_header('Connection', 'close')
            if 'server' not in header_keys:
                self.send_header('Server', self.version_string())
            if 'date' not in header_keys:
                self.send_header('Date', self.date_time_string())
            self.end_headers()
        assert isinstance(data, bytes), 'applications must write bytes'
        self.wfile.write(data)
        self.wfile.flush()
    def start_response(status, response_headers, exc_info=None):
        if exc_info:
            try:
                if headers_sent:
                    reraise(*exc_info)
            finally:
                exc_info = None
        elif headers_set:
            raise AssertionError('Headers already set')
        headers_set[:] = [status, response_headers]
        return write
    def execute(app):
        application_iter = app(environ, start_response)
        try:
            for data in application_iter:
                write(data)
            if not headers_sent:
                write(b'')
        finally:
            if hasattr(application_iter, 'close'):
                application_iter.close()
            application_iter = None
    try:
        execute(self.server.app)
    except (socket.error, socket.timeout) as e:
        self.connection_dropped(e, environ)
    except Exception:
        if self.server.passthrough_errors:
            raise
        from werkzeug.debug.tbtools import get_current_traceback
        traceback = get_current_traceback(ignore_system_exceptions=True)
        try:
            # if we haven't yet sent the headers but they are set
            # we roll back to be able to set them again.
            if not headers_sent:
                del headers_set[:]
            execute(InternalServerError())
        except Exception:
            pass
        self.server.log('error', 'Error on request:\n%s',
                        traceback.plaintext)
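This function is fairly long, so we only look at the main path: execute(self.server.app) ==> application_iter = app(environ, start_response). As mentioned earlier, app is the Flask instance that was passed in when the BaseWSGIServer was created. You may wonder how an object can be called with arguments. This is a Python language feature: because the Flask class defines a __call__ method, calling the instance this way actually invokes Flask.__call__. Likewise, every function has a __call__ method, which we will not go into here. So what does Flask.__call__ actually do? Let's take a look.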
def __call__(self, environ, start_response):
    """Shortcut for :attr:`wsgi_app`."""
    return self.wsgi_app(environ, start_response)
def wsgi_app(self, environ, start_response):
    ctx = self.request_context(environ)
    ctx.push()
    error = None
    try:
        try:
            response = self.full_dispatch_request()
        except Exception as e:
            error = e
            response = self.make_response(self.handle_exception(e))
        return response(environ, start_response)
    finally:
        if self.should_ignore_error(error):
            error = None
        ctx.auto_pop(error)
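The shape of these two methods, a callable instance that delegates to wsgi_app and a context that is pushed on entry and popped in a finally block, can be reproduced in a few lines. TinyApp below is a hypothetical stand-in rather than Flask's implementation; a plain list plays the role of the request context stack.

```python
class TinyApp:
    """Hypothetical stand-in for Flask: an instance that is itself a WSGI app."""

    def __init__(self):
        self.context_stack = []  # plays the role of the request context stack

    def __call__(self, environ, start_response):
        # Calling the instance delegates to wsgi_app.
        return self.wsgi_app(environ, start_response)

    def wsgi_app(self, environ, start_response):
        self.context_stack.append(environ)  # corresponds to ctx.push()
        try:
            body = b"path=" + environ["PATH_INFO"].encode("utf-8")
            start_response("200 OK", [("Content-Length", str(len(body)))])
            return [body]
        finally:
            # Runs even if the code above raises, like ctx.auto_pop(error).
            self.context_stack.pop()


statuses = []


def record_status(status, headers, exc_info=None):
    statuses.append(status)


app = TinyApp()
result = app({"PATH_INFO": "/x"}, record_status)  # invokes TinyApp.__call__
```

The server never needs to know that app is an object rather than a function; it only requires that app(environ, start_response) works.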
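Let's analyze the wsgi_app method.
The first two lines create and push the request context. self.request_context(environ) mainly does two things: first it creates a URL adapter, and second the URL adapter matches the client's request path, read from the PATH_INFO variable in environ, against the registered URL rules. The environ itself, including PATH_INFO derived from the request path self.path, was built earlier by WSGIRequestHandler's make_environ method, which is not described further here.
The most important step in handling the request is response = self.full_dispatch_request().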
def full_dispatch_request(self):
    """Dispatches the request and on top of that performs request
    pre and postprocessing as well as HTTP exception catching and
    error handling.
    .. versionadded:: 0.7
    """
    self.try_trigger_before_first_request_functions()
    try:
        request_started.send(self)
        rv = self.preprocess_request()
        if rv is None:
            rv = self.dispatch_request()
    except Exception as e:
        rv = self.handle_user_exception(e)
    response = self.make_response(rv)
    response = self.process_response(response)
    request_finished.send(self, response=response)
    return response
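self.try_trigger_before_first_request_functions() runs the functions registered with the before_first_request hook. They run once, before the first request is handled.
Next, self.preprocess_request runs the functions registered with the before_request hook. They run before every request.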
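The difference between the two hooks is easiest to see by running them twice. The following is a simplified sketch, not Flask's code: MiniDispatcher and its attribute names are invented for illustration, and signals, error handling and response processing are left out.

```python
class MiniDispatcher:
    """Hypothetical sketch of the hook ordering only."""

    def __init__(self, view):
        self.view = view
        self.before_first_request_funcs = []
        self.before_request_funcs = []
        self._got_first_request = False

    def try_trigger_before_first_request_functions(self):
        if self._got_first_request:
            return
        self._got_first_request = True
        for func in self.before_first_request_funcs:
            func()

    def preprocess_request(self):
        for func in self.before_request_funcs:
            rv = func()
            if rv is not None:
                return rv  # a hook may answer the request itself
        return None

    def full_dispatch_request(self):
        self.try_trigger_before_first_request_functions()
        rv = self.preprocess_request()
        if rv is None:
            rv = self.view()
        return rv


log = []


def view():
    log.append("view")
    return "body"


app = MiniDispatcher(view)
app.before_first_request_funcs.append(lambda: log.append("before_first_request"))
app.before_request_funcs.append(lambda: log.append("before_request"))

first = app.full_dispatch_request()
second = app.full_dispatch_request()
```

After the two calls, log contains before_first_request once and before_request twice.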
Now for the main event, self.dispatch_request: this is the method that actually handles the HTTP request. How does it do that?
def dispatch_request(self):
    """Does the request dispatching.  Matches the URL and returns the
    return value of the view or error handler.  This does not have to
    be a response object.  In order to convert the return value to a
    proper response object, call :func:`make_response`.
    .. versionchanged:: 0.7
       This no longer does the exception handling, this code was
       moved to the new :meth:`full_dispatch_request`.
    """
    req = _request_ctx_stack.top.request
    if req.routing_exception is not None:
        self.raise_routing_exception(req)
    rule = req.url_rule
    # if we provide automatic options for this URL and the
    # request came with the OPTIONS method, reply automatically
    if getattr(rule, 'provide_automatic_options', False) \
       and req.method == 'OPTIONS':
        return self.make_default_options_response()
    # otherwise dispatch to the handler for that endpoint
    return self.view_functions[rule.endpoint](**req.view_args)
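First, naturally, it fetches the request object req. So what is self.view_functions? It is the table that maps endpoints to view functions, that is, the mapping information stored when you use @app.route('/', methods=['GET','POST']). Let's look at the app.route decorator.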
def route(self, rule, **options):
    """A decorator that is used to register a view function for a
    given URL rule.  This does the same thing as :meth:`add_url_rule`
    but is intended for decorator usage::
        @app.route('/')
        def index():
            return 'Hello World'
    For more information refer to :ref:`url-route-registrations`.
    :param rule: the URL rule as string
    :param endpoint: the endpoint for the registered URL rule.  Flask
                     itself assumes the name of the view function as
                     endpoint
    :param options: the options to be forwarded to the underlying
                    :class:`~werkzeug.routing.Rule` object.  A change
                    to Werkzeug is handling of method options.  methods
                    is a list of methods this rule should be limited
                    to (`GET`, `POST` etc.).  By default a rule
                    just listens for `GET` (and implicitly `HEAD`).
                    Starting with Flask 0.6, `OPTIONS` is implicitly
                    added and handled by the standard request handling.
    """
    def decorator(f):
        endpoint = options.pop('endpoint', None)
        self.add_url_rule(rule, endpoint, f, **options)
        return f
    return decorator
def add_url_rule(self, rule, endpoint=None, view_func=None, **options):
    if endpoint is None:
        endpoint = _endpoint_from_view_func(view_func)
    options['endpoint'] = endpoint
    methods = options.pop('methods', None)
    # if the methods are not given and the view_func object knows its
    # methods we can use that instead.  If neither exists, we go with
    # a tuple of only `GET` as default.
    if methods is None:
        methods = getattr(view_func, 'methods', None) or ('GET',)
    methods = set(methods)
    # Methods that should always be added
    required_methods = set(getattr(view_func, 'required_methods', ()))
    # starting with Flask 0.8 the view_func object can disable and
    # force-enable the automatic options handling.
    provide_automatic_options = getattr(view_func,
        'provide_automatic_options', None)
    if provide_automatic_options is None:
        if 'OPTIONS' not in methods:
            provide_automatic_options = True
            required_methods.add('OPTIONS')
        else:
            provide_automatic_options = False
    # Add the required methods now.
    methods |= required_methods
    # due to a werkzeug bug we need to make sure that the defaults are
    # None if they are an empty dictionary.  This should not be necessary
    # with Werkzeug 0.7
    options['defaults'] = options.get('defaults') or None
    rule = self.url_rule_class(rule, methods=methods, **options)
    rule.provide_automatic_options = provide_automatic_options
    self.url_map.add(rule)
    if view_func is not None:
        old_func = self.view_functions.get(endpoint)
        if old_func is not None and old_func != view_func:
            raise AssertionError('View function mapping is overwriting an '
                                 'existing endpoint function: %s' % endpoint)
        self.view_functions[endpoint] = view_func
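Stripped of Werkzeug's rule matching, the registration and dispatch logic reduces to two dictionaries. The sketch below is an assumption-laden simplification: MiniRouter is a made-up class, rules match exact paths only, and a plain dict replaces the real url_map.

```python
class MiniRouter:
    """Hypothetical sketch: exact-match rules only, no URL converters."""

    def __init__(self):
        self.url_map = {}         # rule string -> (endpoint, allowed methods)
        self.view_functions = {}  # endpoint -> view function

    def add_url_rule(self, rule, endpoint=None, view_func=None, methods=None):
        if endpoint is None:
            endpoint = view_func.__name__  # default endpoint: function name
        methods = set(methods or ("GET",))
        old_func = self.view_functions.get(endpoint)
        if old_func is not None and old_func is not view_func:
            raise AssertionError("endpoint already registered: %s" % endpoint)
        self.url_map[rule] = (endpoint, methods)
        self.view_functions[endpoint] = view_func

    def route(self, rule, **options):
        def decorator(f):
            endpoint = options.pop("endpoint", None)
            self.add_url_rule(rule, endpoint, f, **options)
            return f  # the view function itself is returned unchanged
        return decorator

    def dispatch_request(self, path, method="GET"):
        endpoint, methods = self.url_map[path]
        if method not in methods:
            raise ValueError("405 Method Not Allowed")
        return self.view_functions[endpoint]()


app = MiniRouter()


@app.route("/", methods=["GET", "POST"])
def index():
    return "Hello World"
```

Because the decorator returns f unchanged, index remains an ordinary function; the only side effect of @app.route is the entry in the two tables.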
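Instantiating the request handler class:
When select detects an incoming connection, the HTTP server's _handle_request_noblock method (implemented by BaseServer) is called ==> process_request ==> finish_request ==> self.RequestHandlerClass(request, client_address, self), which instantiates the request handler class.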
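This chain can be traced with the standard library alone. The sketch below uses Python 3's socketserver and assumes a loopback connection is available. TracingServer and EchoHandler are made-up names, and handle_request (serve exactly one request) is used in place of serve_forever so that the script terminates.

```python
import socket
import socketserver

chain = []


class EchoHandler(socketserver.StreamRequestHandler):
    def handle(self):
        chain.append("handle")
        line = self.rfile.readline()
        self.wfile.write(b"echo: " + line)


class TracingServer(socketserver.TCPServer):
    def process_request(self, request, client_address):
        chain.append("process_request")
        super().process_request(request, client_address)

    def finish_request(self, request, client_address):
        chain.append("finish_request")
        # The base class runs self.RequestHandlerClass(request, client_address, self).
        super().finish_request(request, client_address)


# Port 0 asks the OS for any free port; the constructor binds and listens.
server = TracingServer(("127.0.0.1", 0), EchoHandler)
server.timeout = 5
host, port = server.server_address

# The connection waits in the listen backlog until the server accepts it.
client = socket.create_connection((host, port), timeout=5)
client.sendall(b"hello\n")

server.handle_request()  # wait for readiness, accept, then run the chain
reply = client.recv(1024)

client.close()
server.server_close()
```

In this run, chain ends up recording process_request, finish_request and handle in that order.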
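Download the source: git clone [email protected]:mitsuhiko/flask.git
Note: this article aims to describe Flask's workflow, so it only covers the most basic BaseWSGIServer and does not cover ThreadedWSGIServer and the like.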