1. 程式人生 > >golang net/http包部分實現原理詳解

golang net/http包部分實現原理詳解

net/http包在編寫golang web應用中有很重要的作用,它主要提供了基於HTTP協議進行工作的client實現和server實現,可用於編寫HTTP服務端和客戶端。

其使用方法也跟其他面嚮物件語言很相似,我們可以先從它的一些基礎用法來感受一下:

以下是示例客戶端程式碼:

// 用http包中預設的Client進行請求(因為沒有指定Client)
resp, err := http.Get("http://example.com/")
...
resp, err := http.Post("http://example.com/upload", "image/jpeg", &buf)
...
resp,
err := http.PostForm("http://example.com/form", url.Values{"key": {"Value"}, "id": {"123"}})
// 用自定義的Client進行請求
client := &http.Client{
	CheckRedirect: redirectPolicyFunc,
}

resp, err := client.Get("http://example.com")
// ...

req, err := http.NewRequest("GET", "http://example.com", nil)
// ...
req.Header.
Add("If-None-Match", `W/"wyzzy"`) resp, err := client.Do(req) // ...

客戶端主要是利用了http.Client物件來進行各種請求的傳送。上面兩個用法中,第一個相對直接,它可以直接用Get函式或Post函式來發出請求,而第二個用法則是分步來完成請求過程:

req, err := http.NewRequest("GET", "http://example.com", nil)
// ...
req.Header.Add("If-None-Match", `W/"wyzzy"`)
resp, err := client.Do(req)

先是用http.NewRequest方法指定了請求方法以及請求url構造請求物件req,並進一步對req的Header進行設定,最後再通過Client.Do方法將請求發出去。

再來看看以下服務端示例程式碼:

// 這裡沒有指定server,所以是用http包中預設的server來處理客戶端請求
http.Handle("/foo", fooHandler)

http.HandleFunc("/bar", func(w http.ResponseWriter, r *http.Request) {
	fmt.Fprintf(w, "Hello, %q", html.EscapeString(r.URL.Path))
})

log.Fatal(http.ListenAndServe(":8080", nil))
// 自定義server
s := &http.Server{
	Addr:           ":8080",
	Handler:        myHandler,
	ReadTimeout:    10 * time.Second,
	WriteTimeout:   10 * time.Second,
	MaxHeaderBytes: 1 << 20,
}
log.Fatal(s.ListenAndServe())

服務端程式碼主要是用http.Server物件來進行請求的處理,第一種用法中並沒有構造Server物件,而是使用http提供的預設Server物件,通過http.Handle指定了匹配"/foo"的請求的Handler,通過HandleFunc方法來指定匹配“/bar”的請求的處理函式。最後通過http.ListenAndServe方法啟動預設的Server物件,並在8080埠監聽客戶端請求。而第二種方法通過建立Server物件對一些屬性進行了指定。

Server物件的實現

我們以net/http/server.go為例來分析Server物件。Server的物件的定義如下:

// A Server defines parameters for running an HTTP server.
// The zero value for Server is a valid configuration.
type Server struct {
	Addr    string  // TCP address to listen on, ":http" if empty
	Handler Handler // handler to invoke, http.DefaultServeMux if nil

	// TLSConfig optionally provides a TLS configuration for use
	// by ServeTLS and ListenAndServeTLS. Note that this value is
	// cloned by ServeTLS and ListenAndServeTLS, so it's not
	// possible to modify the configuration with methods like
	// tls.Config.SetSessionTicketKeys. To use
	// SetSessionTicketKeys, use Server.Serve with a TLS Listener
	// instead.
	TLSConfig *tls.Config

	// ReadTimeout is the maximum duration for reading the entire
	// request, including the body.
	//
	// Because ReadTimeout does not let Handlers make per-request
	// decisions on each request body's acceptable deadline or
	// upload rate, most users will prefer to use
	// ReadHeaderTimeout. It is valid to use them both.
	ReadTimeout time.Duration

	// ReadHeaderTimeout is the amount of time allowed to read
	// request headers. The connection's read deadline is reset
	// after reading the headers and the Handler can decide what
	// is considered too slow for the body.
	ReadHeaderTimeout time.Duration

	// WriteTimeout is the maximum duration before timing out
	// writes of the response. It is reset whenever a new
	// request's header is read. Like ReadTimeout, it does not
	// let Handlers make decisions on a per-request basis.
	WriteTimeout time.Duration

	// IdleTimeout is the maximum amount of time to wait for the
	// next request when keep-alives are enabled. If IdleTimeout
	// is zero, the value of ReadTimeout is used. If both are
	// zero, ReadHeaderTimeout is used.
	IdleTimeout time.Duration

	// MaxHeaderBytes controls the maximum number of bytes the
	// server will read parsing the request header's keys and
	// values, including the request line. It does not limit the
	// size of the request body.
	// If zero, DefaultMaxHeaderBytes is used.
	MaxHeaderBytes int

	// TLSNextProto optionally specifies a function to take over
	// ownership of the provided TLS connection when an NPN/ALPN
	// protocol upgrade has occurred. The map key is the protocol
	// name negotiated. The Handler argument should be used to
	// handle HTTP requests and will initialize the Request's TLS
	// and RemoteAddr if not already set. The connection is
	// automatically closed when the function returns.
	// If TLSNextProto is not nil, HTTP/2 support is not enabled
	// automatically.
	TLSNextProto map[string]func(*Server, *tls.Conn, Handler)

	// ConnState specifies an optional callback function that is
	// called when a client connection changes state. See the
	// ConnState type and associated constants for details.
	ConnState func(net.Conn, ConnState)

	// ErrorLog specifies an optional logger for errors accepting
	// connections, unexpected behavior from handlers, and
	// underlying FileSystem errors.
	// If nil, logging is done via the log package's standard logger.
	ErrorLog *log.Logger

	disableKeepAlives int32     // accessed atomically.
	inShutdown        int32     // accessed atomically (non-zero means we're in Shutdown)
	nextProtoOnce     sync.Once // guards setupHTTP2_* init
	nextProtoErr      error     // result of http2.ConfigureServer if used

	mu         sync.Mutex
	listeners  map[*net.Listener]struct{}
	activeConn map[*conn]struct{}
	doneChan   chan struct{}
	onShutdown []func()
}

Server物件包含了多個屬性,其中較為常用的兩個屬性是Addr和Handler屬性。Addr屬性即是Server物件監聽的地址(埠號)。示例程式碼中的這一行:

http.ListenAndServe(":8080", nil)

傳遞給ListenAndServe方法的“:8080"引數即是用於設定Server物件的Addr屬性。ListenAndServe方法的原始碼如下:

// ListenAndServe always returns a non-nil error.
func ListenAndServe(addr string, handler Handler) error {
	server := &Server{Addr: addr, Handler: handler}
	return server.ListenAndServe()
}

ListenAndServe方法根據Addr和Handler引數構建了一個Server物件,並呼叫該物件的ListenAndServe方法進行監聽和響應。那麼第二個引數Handler的作用是什麼呢,為什麼傳了nil值?

Handler的定義如下:

type Handler interface {
	ServeHTTP(ResponseWriter, *Request)
}

Handler是一個介面型別,包含了ServerHTTP方法,該方法對客戶端的請求(Request物件)進行處理,並通過ResponseWriter將響應資訊傳送回客戶端。可以Handler是Server物件處理請求的邏輯的關鍵所在。但是上面ListenAndServe方法的Handler引數為nil,Handler為Nil的Server物件如何實現處理請求的邏輯呢?

注意到Server物件的Handler屬性的註釋有提到如果Handler值為nil,那麼Server物件將使用http.DefaultServeMux來處理請求。

Handler Handler // handler to invoke, http.DefaultServeMux if nil

http.DefaultServeMux是一個ServeMux物件,ServeMux物件是一個HTTP請求的分發器,將匹配了相應URL的請求分發給相應的處理函式進行處理。其定義如下:

// ServeMux is an HTTP request multiplexer.
// It matches the URL of each incoming request against a list of registered
// patterns and calls the handler for the pattern that
// most closely matches the
// ...
type ServeMux struct {
	mu    sync.RWMutex
	m     map[string]muxEntry
	hosts bool // whether any patterns contain hostnames
}

type muxEntry struct {
	h       Handler
	pattern string
}

ServeMux正是通過map結構(map[string]muxEntry)來匹配請求和Handler的,起到了路由的作用。並且ServeMux本身也是一個Handler,因為它實現了Handler介面的ServeHTTP方法。

再回到服務端的示例程式碼:

// 這裡沒有指定server,所以是用http包中預設的server來處理客戶端請求
http.Handle("/foo", fooHandler)

http.HandleFunc("/bar", func(w http.ResponseWriter, r *http.Request) {
	fmt.Fprintf(w, "Hello, %q", html.EscapeString(r.URL.Path))
})

log.Fatal(http.ListenAndServe(":8080", nil))

我們呼叫了http.Handle方法和HandleFunc方法的時候其實是將"/foo",”/bar"與相應的Handler的對應關係存到http.DefaultServeMux物件中。所以當Server物件的Handler屬性為nil時,Server物件便可以通過http.DefaultServeMux物件來實現請求的處理邏輯。

Client物件的實現

我們以net/http/client.go為例來分析Client物件。Client物件的定義如下:

// A Client is an HTTP client. Its zero value (DefaultClient) is a
// usable client that uses DefaultTransport.
//
// The Client's Transport typically has internal state (cached TCP
// connections), so Clients should be reused instead of created as
// needed. Clients are safe for concurrent use by multiple goroutines.
//
// A Client is higher-level than a RoundTripper (such as Transport)
// and additionally handles HTTP details such as cookies and
// redirects.
//
// When following redirects, the Client will forward all headers set on the
// initial Request except:
//
// • when forwarding sensitive headers like "Authorization",
// "WWW-Authenticate", and "Cookie" to untrusted targets.
// These headers will be ignored when following a redirect to a domain
// that is not a subdomain match or exact match of the initial domain.
// For example, a redirect from "foo.com" to either "foo.com" or "sub.foo.com"
// will forward the sensitive headers, but a redirect to "bar.com" will not.
//
// • when forwarding the "Cookie" header with a non-nil cookie Jar.
// Since each redirect may mutate the state of the cookie jar,
// a redirect may possibly alter a cookie set in the initial request.
// When forwarding the "Cookie" header, any mutated cookies will be omitted,
// with the expectation that the Jar will insert those mutated cookies
// with the updated values (assuming the origin matches).
// If Jar is nil, the initial cookies are forwarded without change.
//
type Client struct {
	// Transport specifies the mechanism by which individual
	// HTTP requests are made.
	// If nil, DefaultTransport is used.
	Transport RoundTripper

	// CheckRedirect specifies the policy for handling redirects.
	// If CheckRedirect is not nil, the client calls it before
	// following an HTTP redirect. The arguments req and via are
	// the upcoming request and the requests made already, oldest
	// first. If CheckRedirect returns an error, the Client's Get
	// method returns both the previous Response (with its Body
	// closed) and CheckRedirect's error (wrapped in a url.Error)
	// instead of issuing the Request req.
	// As a special case, if CheckRedirect returns ErrUseLastResponse,
	// then the most recent response is returned with its body
	// unclosed, along with a nil error.
	//
	// If CheckRedirect is nil, the Client uses its default policy,
	// which is to stop after 10 consecutive requests.
	CheckRedirect func(req *Request, via []*Request) error

	// Jar specifies the cookie jar.
	//
	// The Jar is used to insert relevant cookies into every
	// outbound Request and is updated with the cookie values
	// of every inbound Response. The Jar is consulted for every
	// redirect that the Client follows.
	//
	// If Jar is nil, cookies are only sent if they are explicitly
	// set on the Request.
	Jar CookieJar

	// Timeout specifies a time limit for requests made by this
	// Client. The timeout includes connection time, any
	// redirects, and reading the response body. The timer remains
	// running after Get, Head, Post, or Do return and will
	// interrupt reading of the Response.Body.
	//
	// A Timeout of zero means no timeout.
	//
	// The Client cancels requests to the underlying Transport
	// as if the Request's Context ended.
	//
	// For compatibility, the Client will also use the deprecated
	// CancelRequest method on Transport if found. New
	// RoundTripper implementations should use the Request's Context
	// for cancelation instead of implementing CancelRequest.
	Timeout time.Duration
}

// DefaultClient is the default Client and is used by Get, Head, and Post.
var DefaultClient = &Client{}

從Client物件定義上方的註釋我們可以知道Client是一個HTTP客戶端,且它的零值(即http.DefaultClient = &Client{})是個可用的預設客戶端。該預設客戶端使用的是DefaultTransport。

這也解釋了上面客戶端程式碼的示例用法中我們為什麼可以不定義Client物件就能夠直接呼叫http.Get和http.Post進行請求。

再往下看註釋,上面說到的DefaultTransport物件是一個Transport型別的物件,它維持了內部的狀態(快取TCP連線),因此當需要更多Client物件時應該重用而不是重新定義,否則會丟失了原有的內部狀態。而且Client物件是執行緒安全的,即支援被多個執行緒安全地同時使用。

另外與RoundTripper物件相比,Client物件更加高階,它還負責處理HTTP細節資訊如cookies和重定向。

Client物件包含了以下四個欄位:

type Client struct {
	Transport RoundTripper
	CheckRedirect func(req *Request, via []*Request) error
	Jar CookieJar
	Timeout time.Duration
}

Transport欄位是RoundTripper型別的物件,它決定了發出每個HTTP請求所使用的機制。如果為nil值則使用http.DefaultTransport。

CheckRedirect是個用來處理重定向的請求的函式,Jar跟cookie相關,Timeout跟連線的超時值相關。

跟Client傳送請求關係最密切的就是RoundTripper,其定義如下:

type RoundTripper interface {
	// RoundTrip executes a single HTTP transaction, returning
	// a Response for the provided Request.
	//
	// RoundTrip should not attempt to interpret the response. In
	// particular, RoundTrip must return err == nil if it obtained
	// a response, regardless of the response's HTTP status code.
	// A non-nil err should be reserved for failure to obtain a
	// response. Similarly, RoundTrip should not attempt to
	// handle higher-level protocol details such as redirects,
	// authentication, or cookies.
	//
	// RoundTrip should not modify the request, except for
	// consuming and closing the Request's Body. RoundTrip may
	// read fields of the request in a separate goroutine. Callers
	// should not mutate or reuse the request until the Response's
	// Body has been closed.
	//
	// RoundTrip must always close the body, including on errors,
	// but depending on the implementation may do so in a separate
	// goroutine even after RoundTrip returns. This means that
	// callers wanting to reuse the body for subsequent requests
	// must arrange to wait for the Close call before doing so.
	//
	// The Request's URL and Header fields must be initialized.
	RoundTrip(*Request) (*Response, error)
}

RoundTripper同樣是個介面,包含了RoundTrip方法,接受Request物件然後返回Response物件。RoundTrip方法負責執行一次HTTP請求/接受響應的過程,而且它不負責對響應的內容進行解析,只關心響應是否為空,若為空則返回error。對響應內容的解析是由更高一層的Client物件來處理。

接下來看Client物件的方法實現,Client物件包含的一些常用公有方法如下:

func (c *Client) Get(url string) (resp *Response, err error) {
	req, err := NewRequest("GET", url, nil)
	if err != nil {
		return nil, err
	}
	return c.Do(req)
}


func (c *Client) Post(url, contentType string, body io.Reader) (resp *Response, err error) {
	req, err := NewRequest("POST", url, body)
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", contentType)
	return c.Do(req)
}

Client物件的Get和Post方法的實現都比較相似,先是構造包含請求方法和url,請求體(body)的Request物件(通過NewRequest方法構造),然後再通過Client.Do方法執行該請求。

Do方法的相關程式碼如下:

func (c *Client) Do(req *Request) (*Response, error) {
	return c.do(req)
}

var testHookClientDoResult func(retres *Response, reterr error)

func (c *Client) do(req *Request) (retres *Response, reterr error) {
	// 進行一些檢查
	if testHookClientDoResult != nil {
		defer func() { testHookClientDoResult(retres, reterr) }()
	}
	if req.URL == nil {
		req.closeBody()
		return nil, &url.Error{
			Op:  urlErrorOp(req.Method),
			Err: errors.New("http: nil Request.URL"),
		}
	}
	// 定義相關變數(重點關注reqs及引數req)
	var (
		deadline      = c.deadline()
		reqs          []*Request
		resp          *Response
		copyHeaders   = c.makeHeadersCopier(req)
		reqBodyClosed = false // have we closed the current req.Body?

		// Redirect behavior:
		redirectMethod string
		includeBody    bool
	)
	uerr := func(err error) error {
		// the body may have been closed already by c.send()
		if !reqBodyClosed {
			req.closeBody()
		}
		var urlStr string
		if resp != nil && resp.Request != nil {
			urlStr = stripPassword(resp.Request.URL)
		} else {
			urlStr = stripPassword(req.URL)
		}
		return &url.Error{
			Op:  urlErrorOp(reqs[0].Method),
			URL: urlStr,
			Err: err,
		}
	}
	// 通過迴圈,從原始請求req開始執行,得到響應後判斷是否應該重定向,若需要
	// 則更新req物件傳送新的請求,直至不需要再重定向或出現錯誤為止
	for {
		// For all but the first request, create the next
		// request hop and replace req.
		if len(reqs) > 0 {
			loc := resp.Header.Get("Location")
			if loc == "" {
				resp.closeBody()
				return nil, uerr(fmt.Errorf("%d response missing Location header", resp.StatusCode))
			}
			u, err := req.URL.Parse(loc)
			if err != nil {
				resp.closeBody()
				return nil, uerr(fmt.Errorf("failed to parse Location header %q: %v", loc, err))
			}
			host := ""
			if req.Host != "" && req.Host != req.URL.Host {
				// If the caller specified a custom Host header and the
				// redirect location is relative, preserve the Host header
				// through the redirect. See issue #22233.
				if u, _ := url.Parse(loc); u != nil && !u.IsAbs() {
					host = req.Host
				}
			}
			ireq := reqs[0]
			req = &Request{
				Method:   redirectMethod,
				Response: resp,
				URL:      u,
				Header:   make(Header),
				Host:     host,
				Cancel:   ireq.Cancel,
				ctx:      ireq.ctx,
			}
			if includeBody && ireq.GetBody != nil {
				req.Body, err = ireq.GetBody()
				if err != nil {
					resp.closeBody()
					return nil, uerr(err)
				}
				req.ContentLength = ireq.ContentLength
			}

			// Copy original headers before setting the Referer,
			// in case the user set Referer on their first request.
			// If they really want to override, they can do it in
			// their CheckRedirect func.
			copyHeaders(req)

			// Add the Referer header from the most recent
			// request URL to the new one, if it's not https->http:
			if ref := refererForURL(reqs[len(reqs)-1].URL, req.URL); ref != "" {
				req.Header.Set("Referer", ref)
			}
			err = c.checkRedirect(req, reqs)

			// Sentinel error to let users select the
			// previous response, without closing its
			// body. See Issue 10069.
			if err == ErrUseLastResponse {
				return resp, nil
			}

			// Close the previous response's body. But
			// read at least some of the body so if it's
			// small the underlying TCP connection will be
			// re-used. No need to check for errors: if it
			// fails, the Transport won't reuse it anyway.
			const maxBodySlurpSize = 2 << 10
			if resp.ContentLength == -1 || resp.ContentLength <= maxBodySlurpSize {
				io.CopyN(ioutil.Discard, resp.Body, maxBodySlurpSize)
			}
			resp.Body.Close()

			if err != nil {
				// Special case for Go 1 compatibility: return both the response
				// and an error if the CheckRedirect function failed.
				// See https://golang.org/issue/3795
				// The resp.Body has already been closed.
				ue := uerr(err)
				ue.(*url.Error).URL = loc
				return resp, ue
			}
		}
		
		// 通過send方法將req請求傳送出去,並通過響應內容判斷是否需要重定向。
		// 若需要重定向,則進入下一個迴圈將req替換為一個新的req物件用於發生新的請求到重定向所指向的地址
		// 若不需要重定向,則返回響應結果
		reqs = append(reqs, req)
		var err error
		var didTimeout func() bool
		if resp, didTimeout, err = c.send(req, deadline); err != nil {
			// c.send() always closes req.Body
			reqBodyClosed = true
			if !deadline.IsZero() && didTimeout() {
				err = &httpError{
					// TODO: early in cycle: s/Client.Timeout exceeded/timeout or context cancelation/
					err:     err.Error() + " (Client.Timeout exceeded while awaiting headers)",
					timeout: true,
				}
			}
			return nil, uerr(err)
		}

		var shouldRedirect bool
		redirectMethod, shouldRedirect, includeBody = redirectBehavior