1. 程式人生 > >Under the hood of TCP Socket Implementation On Golang

Under the hood of TCP Socket Implementation On Golang

Under the hood of TCP Socket Implementation On Golang

Golang is surely my first to go language to write web application, it hides many details but still gives a lot of flexibility. Recently I did strace to an http service app, actually did it without a reason but I found something interesting.

Here’s what strace -c

gave me :

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
91.24 0.397615 336 1185 29 futex
4.13 0.018009 3 7115 clock_gettime
2.92 0.012735 19 654 epoll_wait
1.31 0.005701 6 911 write
0.20 0.000878 3 335 epoll_ctl
0.12 0.000525 1 915 457 read
0.02 0.000106 2 59 select
0.01 0.000059 0 170 close
0.01 0.000053 0 791 setsockopt
0.01 0.000035 0 158 getpeername
0.01 0.000034 0 170 socket
0.01 0.000029 0 160 getsockname
0.01 0.000026 0 159 getsockopt
0.00 0.000000 0 7 sched_yield
0.00 0.000000 0 166 166 connect
0.00 0.000000 0 3 1 accept4
------ ----------- ----------- --------- --------- ----------------
100.00 0.435805 12958 653 total

There are a lot of interesting things in that profiling result, but the highlight of this article are the error number of read and the number of futex call

At the first sight I didn’t think about futex call, which most of them are a wake call. I thought that this app serve hundreds req per second, it should has many go routine; At the other hand it also leverage channel, it should has many under the hood blocking mechanism. So a lot of futex

call should be normal, later I found out that this number was also from other things -will back to it later-.

Why you no read

Who’s like error? Worst when it happens hundreds time in less than a minute, that’s what I thought when looking to that profiling result. What the heck with that read call?

read(36, "GET /xxx/v3?q=xx%20ch&d"..., 4096) = 520
...
read(36, 0xc422aa4291, 1) = -1 EAGAIN (Resource temporarily unavailable)

After every read call there is (maybe) always one read to the same file descriptor with EAGAIN error. Hey I remember this error, it comes when the file descriptor is not ready to certain operation; in this case read . But Why golang do this way?

I thought there was bug from the epoll_wait which gave wrong ready event to every fd? every fd? It looks like double ready rather than wrong event, why double?

Honestly my experience with epoll is shallow, basic event loop socket handler [0] is the only one. No multi threading, no synchronization, it’s very plain. After googling I found Marek (clever guy) criticizing epoll [1], and it’s like boom!!!

Important TL;DR from that post : unnecessary wake up is ‘generally’ unavoidable when using epoll with multi threaded worker, because weneed to notify each worker about waiting ready events.

Hey that also explain about our futex wake numbers. But lets take a look at simplified version on how to use epoll in event loop socket implementation to further understand it :

  1. Bind socket listener to file descriptor, lets call it s_fd
  2. Create epoll file descriptor using epoll_create, lets call it e_fd
  3. Bind s_fd to e_fd using epol_ctl for particular events (usually bothEPOLLIN|EPOLLOUT )
  4. Create forever loop (event loop) which call epoll_wait in every iteration to get incoming ready connections fd from binded events
  5. Handle the ready connection, in multi worker implementation means notifying each worker

Using strace I found that golang using edge triggered epoll

epoll_ctl(4, EPOLL_CTL_ADD, 3, {EPOLLIN|EPOLLOUT|EPOLLRDHUP|EPOLLET, {u32=2490298448, u64=140490870550608}}) = 0

Which means this is also valid on golang socket implementation :

  1. Kernel: Receives a new connection.
  2. Kernel: Notifies two waiting threads A and B. Due to “thundering herd” behavior with level-triggered notifications kernel must wake up both.
  3. Thread A: Finishes epoll_wait().
  4. Thread B: Finishes epoll_wait().
  5. Thread A: Performs accept(), this succeeds.
  6. Thread B: Performs accept(), this fails with EAGAIN.

At this point I am 80% confident this was the case, but let’s do some analysis using this simple web app [2]. Here is what it does on single request.

epoll_wait(4, [{EPOLLIN|EPOLLOUT, {u32=2186919600, u64=140542106779312}}], 128, -1) = 1
futex(0x7c1bd8, FUTEX_WAKE, 1) = 1
futex(0x7c1b10, FUTEX_WAKE, 1) = 1
read(5, "GET / HTTP/1.1\r\nHost: localhost:"..., 4096) = 348
futex(0xc420060110, FUTEX_WAKE, 1) = 1
write(5, "HTTP/1.1 200 OK\r\nDate: Sat, 03 J"..., 116) = 116
futex(0xc420060110, FUTEX_WAKE, 1) = 1
read(5, 0xc4200f6000, 4096) = -1 EAGAIN (Resource temporarily unavailable)

Look how after epoll_wait there are two futex wake call which I think is the worker and then there are error read.

How if I set GOMAXPROCS to 1, enforcing single worker behavior.

epoll_wait(4,[{EPOLLIN, {u32=1969377136, u64=140245536493424}}], 128, -1) = 1
futex(0x7c1bd8, FUTEX_WAKE, 1) = 1
accept4(3, {sa_family=AF_INET6, sin6_port=htons(54400), inet_pton(AF_INET6, "::ffff:127.0.0.1", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, [28], SOCK_CLOEXEC|SOCK_NONBLOCK) = 6
epoll_ctl(4, EPOLL_CTL_ADD, 6, {EPOLLIN|EPOLLOUT|EPOLLRDHUP|EPOLLET, {u32=1969376752, u64=140245536493040}}) = 0
getsockname(6, {sa_family=AF_INET6, sin6_port=htons(8080), inet_pton(AF_INET6, "::ffff:127.0.0.1", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, [28]) = 0
setsockopt(6, SOL_TCP, TCP_NODELAY, [1], 4) = 0
setsockopt(6, SOL_SOCKET, SO_KEEPALIVE, [1], 4) = 0
setsockopt(6, SOL_TCP, TCP_KEEPINTVL, [180], 4) = 0
setsockopt(6, SOL_TCP, TCP_KEEPIDLE, [180], 4) = 0
accept4(3, 0xc42004db78, 0xc42004db6c, SOCK_CLOEXEC|SOCK_NONBLOCK) = -1 EAGAIN (Resource temporarily unavailable)
read(6, "GET /test?kjhkjhkjh HTTP/1.1\r\nHo"..., 4096) = 92
write(6, "HTTP/1.1 200 OK\r\nDate: Sat, 03 J"..., 139) = 139
read(6, "", 4096) = 0

When using 1 worker, there is only once futex wake call after epoll_wait and no read error. However I found this behavior is not consistent, in some trial I still got read error and 2 futex wake call.

And then what to do?

In the same post [1] Marek talk about EPOLLEXCLUSIVE which available starting from linux version 4.5. I’m using version 4.8 why it’s still happening on my maching? Turns out that golang is not using that flag yet, I hope at the next next version it will support that flag when available.

I learned a lot from this journey, hope you too.

相關推薦

Under the hood of TCP Socket Implementation On Golang

Under the hood of TCP Socket Implementation On GolangGolang is surely my first to go language to write web application, it hides many details but still giv

Under the hood of Pixling World

Under the hood of Pixling WorldThis will be a look under the hood of Pixling World, an artificial life/evolution simulator/god simulator I’m building. As a

To make sense of A.I. decisions, ‘peek under the hood

To make sense of A.I. decisions, ‘peek under the hood’Now that humans have programmed computers to learn, we want to know exactly what they’ve learned and

【DATE2017】Double MAC: Doubling the Performance of Convolutional Neural Networks on Modern FPGAs

-1 資源 font 文章 討論 要點 兩個 需要 分享 這篇文章介紹了如何利用FPGA內部單個DSP來實現SIMD乘法,從而提高DSP利用率,緩解計算資源不足的問題,是一個比較實用的trick。 要點: 利用單個DSP並行實現兩次乘法:A*C、B*C; 文中只討論了A、

The impact of trade war expansion on Bitcoin

                                         Bitcoin coin is probably the single biggest winner of the US-China trade war, leaving investors

Under the hood: Airbnb

Under the hood: AirbnbThis blog series focuses on examining the collection of device data by various popular mobile applications. This data is often collec

Uber Under the Hood

Sharing the CurbCurb space in city centers is a scarce resource. And as demand for this space increases — with the rise of urban deliveries, shared mobilit

A study analyzes the impact of targeted Facebook advertising on the election: It is estimated that it increased the number of Tr

In order to carry out these campaigns, the messages are sent out to Facebook users based on various factors, such as their gender, location or political a

Under The Hood: Assembly instructions used often1

Apparently,there's quite a bit more interest in Win32® assembly language than I had originally thought. After the February 1998 issue ofMSJ hit the stands

On the Convergence of the TTL Approximation for an LRU Cache under Independent Stationary Request Pr

1.TTL和LRU之間的關係 當LRU的時間T設定為“特徵時間”時,LRU的 效能收斂於TTL( the performance of LRU converges to that of a TTL cache with a timer set to the character

The KEY Point of Coffee Lake Power on

coffee lake power-on me1, If the system can’t power up. a, please set PlatformImonDisable to 0x1 in xml file as following table b, check OEM Public Key Ha

A Survey on the Security of Stateful SDN Data Planes

語言 共享 虛擬 fault conf enter 完整性 鏈路 定義 論文摘要: 本文為讀者提供新興的SDN帶狀態數據平面,集中關註SDN數據平面編程性帶來的隱患。 I部分 介紹 A.帶狀態SDN數據平面的 B.帶狀態數據平面帶來的安全隱患   引出帶狀態數據平面的安全隱

三江源區高寒草地地上生物量遙感反演模型研究 Modeling Aboveground Biomass of Alpine Grassland in the Three-River Headwaters Region Based on Remote Sensing Data

col 評估 空格 類型 pan 擬合 mar [1] ESS 作  者:韓波高艷妮郭楊張林波王德旺徐良驥楊波 作者機構:[1]中國環境科學研究院,環境基準與風險評估國家重點實驗室,北京100012;[2]中國環境科學研究院,國家環境保護區域生態過程與功能評估重點實驗室,北

定向運動員識圖過程中視覺搜索特征研究 Study on Visual Search Characteristics of Orienteers in the Process of Map Reading

lec har tro 學院 收錄 專家 page 特點 sea 作  者:劉陽何勁鵬LIU YangHE Jin-peng(Xi‘an University of Technology,Xi‘an,710048,ChinaNortheast Normal Universi

Does the semantics of `std::memory_order_acquire` requires processor instructions on x86/x86_64?

https://stackoverflow.com/questions/18576986/does-the-semantics-of-stdmemory-order-acquire-requires-processor-instruction 7down votefavorite 4 I

上傳文件報錯:Warning: POST Content-Length of 9443117 bytes exceeds the limit of 8388608 bytes in Unknown on line 0

The lin clas php 但是 exce 小文件 情況 lan 只需在php.ini中設置: upload_max_filesize = 1000M; post_max_size = 1000M;還有一種情況就是小文件可以上傳成功,但是大文件上傳,$_FILES

On the Performance of MIMO-NOMA-Based Visible Light Communication Systems

讀後感: 今天讀了《On the Performance of MIMO-NOMA-Based Visible Light Communication Systems》有感如下: 摘要:在本文中,我們應用non-orthogonal multiple access(NOMA)技術去提高基於多

【Network Architecture】Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning(轉) Feature Extractor[Inception v4]

文章來源: https://www.cnblogs.com/shouhuxianjian/p/7786760.html Feature Extractor[Inception v4] 0. 背景 隨著何凱明等人提出的ResNet v1,google這邊坐

Thoughts on the Application of Radar Technology to the Improvement of Street Light System

Abstract: Street lights are everywhere in people’s daily lives. But street lights can sometimes cause problems for people. First of all, long-term

Warning: POST Content-Length of 16767309 bytes exceeds the limit of 8388608 bytes in Unknown on line

使用xampp和WordPress,上傳新主題時報錯。 錯誤資訊: Warning: POST Content-Length of 16767309 bytes exceeds the limit of 8388608 bytes in Unknown on line&n