
Flask Sentry: Getting the Raw Request Body



Flask Sentry

Sentry

Users and logs provide clues. Sentry provides answers.

What's Sentry?

Sentry fundamentally is a service that helps you monitor and fix crashes in realtime. The server is in Python, but it contains a full API for sending events from any language, in any application. https://github.com/getsentry/sentry

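We won't go into detail here; see the official site for specifics. In short, Sentry is an open-source error log collection service that supports the mainstream languages.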

Using Sentry with Flask

How do you use Sentry with Flask?
See the official documentation. Basic integration is very simple and takes about two lines of code.

Problem

After integrating Sentry, we found that our existing calls to flask.Request.get_data could no longer retrieve the raw request body.

get_data(cache=True, as_text=False, parse_form_data=False)

This reads the buffered incoming data from the client into one bytestring. By default this is cached but that behavior can be changed by setting cache to False.
Usually it’s a bad idea to call this method without checking the content length first as a client could send dozens of megabytes or more to cause memory problems on the server.
Note that if the form data was already parsed this method will not return anything as form data parsing does not cache the data like this method does. To implicitly invoke form data parsing function set parse_form_data to True. When this is done the return value of this method will be an empty string if the form parser handles the data. This generally is not necessary as if the whole data is cached (which is the default) the form parser will use the cached data to parse the form data. Please be generally aware of checking the content length first in any case before calling this method to avoid exhausting server memory.
If as_text is set to True the return value will be a decoded unicode string.

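Looking at the Flask Sentry source code, we found that Sentry records request information before each request is handled. Given what Sentry is meant to do, this is to be expected.
That implementation accesses request.form or request.data (see the source code).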

werkzeug.wrappers.BaseRequest.data

Contains the incoming request data as string in case it came with a mimetype Werkzeug does not handle.

In other words, Sentry reads request.data before our own code does. When request.data cannot be decoded as UTF-8, the content is discarded, so whatever we read afterwards is empty.
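The "whoever reads first wins" behaviour can be illustrated with a toy model. This is only a sketch under stated assumptions: OneShotRequest and parse_without_cache are hypothetical names invented for illustration, not Werkzeug or Sentry APIs, and the real implementations are more involved. It shows only that a read-once stream yields nothing to a later reader unless an earlier call cached the bytes.

```python
import io


class OneShotRequest:
    """Toy stand-in for a request whose body arrives as a read-once stream."""

    def __init__(self, body: bytes):
        self._stream = io.BytesIO(body)
        self._cached = None

    def get_data(self, cache: bool = True) -> bytes:
        # Return the cached copy if an earlier call stored one.
        if self._cached is not None:
            return self._cached
        data = self._stream.read()  # drains whatever is left in the stream
        if cache:
            self._cached = data
        return data

    def parse_without_cache(self) -> bytes:
        # Models a consumer that reads the stream and keeps nothing
        # for later callers.
        return self._stream.read()


body = "<name>中文</name>".encode("gbk")

# Case 1: another consumer runs first, so the handler sees nothing.
req = OneShotRequest(body)
req.parse_without_cache()
late_read = req.get_data()

# Case 2: the body is cached before the other consumer runs.
req = OneShotRequest(body)
req.get_data(cache=True)
req.parse_without_cache()
cached_read = req.get_data()
```

In the first case late_read is empty; in the second, cached_read still holds the original bytes because the cache was filled before the stream was drained.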

Solution

Option 1

We are certainly not the first to run into this problem; see https://github.com/getsentry/raven-python/issues/457

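from flask import abort, request

@app.before_request
def enable_form_raw_cache():
    # Read and cache the raw body early so later readers still see it.
    if request.path.startswith('/redacted'):
        # content_length is None when the Content-Length header is missing.
        if (request.content_length or 0) > 1024 * 1024:  # 1 MiB
            abort(413)  # Payload Too Large
        request.get_data(parse_form_data=False, cache=True)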

Option 2

from flask import request

# Only paths with these prefixes get their raw body cached.
CACHE_PATH_PREFIXES = ('/PATH_FOO', '/PATH_BAR')

@app.before_request
def enable_form_raw_cache():
    # str.startswith accepts a tuple of prefixes.
    if request.path.startswith(CACHE_PATH_PREFIXES):
        request.get_data(parse_form_data=False, cache=True)

The idea in both cases is the same: cache request.data before Sentry accesses it, and do so selectively.
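The two snippets above can be combined into one decision: which paths to cache, and when to refuse a body as too large. The following is a minimal sketch, assuming a hypothetical helper named cache_decision that is not part of Flask or Sentry. The prefixes are the placeholders from the second snippet and the 1 MiB limit comes from the first. Treating a missing Content-Length as acceptable is an assumption made here for illustration.

```python
MAX_BODY_BYTES = 1024 * 1024  # 1 MiB, the limit used in the first snippet
CACHE_PATH_PREFIXES = ("/PATH_FOO", "/PATH_BAR")  # placeholder prefixes


def cache_decision(path, content_length):
    """Return 'skip', 'reject' or 'cache' for an incoming request."""
    # Paths outside the list are left alone.
    if not path.startswith(CACHE_PATH_PREFIXES):
        return "skip"
    # content_length may be None when the header is absent (assumed OK here).
    if content_length is not None and content_length > MAX_BODY_BYTES:
        return "reject"
    return "cache"
```

In a before_request hook, "reject" would map to abort(413) and "cache" to request.get_data(parse_form_data=False, cache=True). Keeping the decision in a plain function makes it easy to unit test without a running application.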

N more things

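Further thoughts

  • Why is there data that is not UTF-8 encoded?
    • This one is half a joke: we have to accept GBK-encoded XML data. We won't go into the root cause here.
  • Why doesn't Flask cache every raw request body?
    • A fair question. One likely reason is that keeping too many raw bodies would consume memory. Also, once the request data has been read from the stream and parsed into a structured form, there is little need to keep the raw body around.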
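To see why a GBK-encoded body trips up code that expects UTF-8, here is a small sketch using only the standard library. The XML payload is an invented example, not the actual data from our service.

```python
# A made-up GBK-encoded XML payload, for illustration only.
payload = '<?xml version="1.0" encoding="GBK"?><name>中文</name>'.encode("gbk")

# Decoding the same bytes as UTF-8 fails, because the GBK byte
# sequences for the Chinese characters are not valid UTF-8.
try:
    payload.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# Decoding with the right codec recovers the original text.
text = payload.decode("gbk")
```

This is why the handler needs the raw bytes: only the raw body can be decoded with the encoding declared in the XML header.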