HttpClient在多線程環境下踩坑總結
問題現場
在多線程環境下使用HttpClient組件對某個HTTP服務發起請求,運行一段時間之後發現客戶端主機CPU利用率呈現出下降趨勢,而不是一個穩定的狀態。
而且,從程序日誌中判斷有線程處於夯住的狀態,應該是被阻塞了。
問題排查
一開始找不到原因,懷疑是多線程並發導致的死鎖問題,但是通過代碼審查並未定位到任何可能的多線程並發問題。
甚至開始懷疑是否是因為內存資源不夠引起JVM頻繁GC到導致業務線程被暫停,但是從GC的日誌輸出結果看,GC是正常的。
於是,進入一種丈二和尚摸不著頭腦頭腦的狀態,再次Review代碼,發現並未設置請求超時時間,於是設置超時控制,發現問題依然存在,徹底懵逼了。
最後,dump線程堆棧和內存堆棧,再對堆棧數據進行分析。從分析結果看,確認是因為Socket連接在讀取數據時被阻塞引起線程夯住。搜索“httpclient 超時”關鍵字,找到各式各樣設置HttpClient超時控制的方式,均嘗試過但是並未生效。
問題重現
1.HttpClient版本
<dependency> <groupId>org.apache.httpcomponents</groupId> <artifactId>httpclient</artifactId> <version>4.5.2</version> </dependency>
2.Java代碼
public class HttpClientTest { private AtomicInteger counter = new AtomicInteger(0); private String url = "http://www.baidu.com/"; public static void main(String[] args) { new HttpClientTest().test(); } // 執行測試 private void test() { int number = 100000; // 總請求數 int concurrent = 50; // 每次並發請求數 CountDownLatch countDownLatch = new CountDownLatch(number); // 計數器 ExecutorService threadPool = Executors.newFixedThreadPool(concurrent); // 線程池 int concurrentPer = concurrent; boolean over = false; while(!over) { number = number - concurrent; if(number <= 0) { concurrentPer = number + concurrent; over = true; } // 線程池批量提交 for(int i = 0; i < concurrentPer; i++) { threadPool.execute(new Runnable() { @Override public void run() { try { request(url); Thread.sleep(100); } catch (IOException | InterruptedException e) { e.printStackTrace(); } finally { countDownLatch.countDown(); } } }); } } try { countDownLatch.await(); threadPool.shutdown(); } catch (InterruptedException e) { e.printStackTrace(); } } // 訪問指定地址 private void request(String url) throws IOException { HttpGet httpGet = new HttpGet(url); commnicate(httpGet); } // 負責底層通信處理 private void commnicate(HttpRequestBase request) throws IOException { ResponseHandler<String> responseHandler = new ResponseHandler<String>() { @Override public String handleResponse(HttpResponse response) throws ClientProtocolException, IOException { return EntityUtils.toString(response.getEntity()); } }; HttpClient client = HttpClients.createDefault(); String body = client.execute(request, responseHandler); // 線程可能會在這裏被阻塞 System.out.println(String.format("body size: %s, counter: %s", body.length(), counter.incrementAndGet())); } }
運行上述代碼一段時間後很容易可以重現出問題,如下為運行控制臺信息:
並且線程全部夯住,進程無法正常結束.
查看端口狀態存在大量請求處於建立連接狀態(ESTABLISHED):
# netstat -anpt
tcp 0 0 172.17.7.81:56408 180.118.128.179:56033 ESTABLISHED 3766/java
tcp 0 0 172.17.7.81:57644 115.202.238.177:27016 ESTABLISHED 3847/java
tcp 0 0 172.17.7.81:36616 117.57.21.0:25719 ESTABLISHED 3766/java
tcp 0 0 172.17.7.81:59944 112.245.197.118:57220 ESTABLISHED 3847/java
tcp 0 0 172.17.7.81:48722 218.5.215.10:40835 ESTABLISHED 4007/java
tcp 0 0 172.17.7.81:52734 115.194.17.14:45210 ESTABLISHED 4007/java
tcp 0 0 172.17.7.81:60586 59.32.37.129:16637 ESTABLISHED 3686/java
tcp 0 0 172.17.7.81:36776 222.89.86.109:21667 ESTABLISHED 3766/java
tcp 0 0 172.17.7.81:51690 60.161.249.162:59039 ESTABLISHED 3927/java
tcp 0 0 172.17.7.81:42226 58.218.200.59:80 TIME_WAIT -
tcp 0 0 172.17.7.81:56566 117.70.47.194:40879 ESTABLISHED 3686/java
tcp 0 0 172.17.7.81:43266 182.120.202.204:45893 ESTABLISHED 3766/java
tcp 0 0 172.17.7.81:55630 60.169.223.16:21280 ESTABLISHED 3927/java
tcp 0 0 172.17.7.81:54922 60.168.81.26:25464 ESTABLISHED 3927/java
tcp 0 0 172.17.7.81:53352 112.252.97.83:53584 ESTABLISHED 3847/java
tcp 0 0 172.17.7.81:52684 113.121.242.43:14447 ESTABLISHED 3927/java
tcp 0 0 172.17.7.81:54750 113.121.241.168:45173 ESTABLISHED 3686/java
tcp 0 0 172.17.7.81:41510 113.105.202.106:47288 ESTABLISHED 4007/java
tcp 0 0 172.17.7.81:38804 121.232.148.62:57938 ESTABLISHED 3847/java
tcp 0 0 172.17.7.81:41468 113.105.202.106:47288 ESTABLISHED 3927/java
tcp 0 0 172.17.7.81:45444 123.163.81.185:22012 ESTABLISHED 3766/java
tcp 0 0 172.17.7.81:54810 113.121.241.168:45173 ESTABLISHED 4007/java
tcp 0 0 172.17.7.81:51542 175.153.23.147:20766 ESTABLISHED 3927/java
tcp 0 0 172.17.7.81:45644 218.5.215.10:40835 ESTABLISHED 4007/java
tcp 0 0 172.17.7.81:35730 116.53.197.198:30042 ESTABLISHED 3766/java
tcp 0 0 172.17.7.81:54738 113.121.241.168:45173 ESTABLISHED 3686/java
tcp 0 0 172.17.7.81:60600 59.32.37.129:16637 ESTABLISHED 3686/java
tcp 0 0 172.17.7.81:54862 113.121.241.168:45173 ESTABLISHED 4007/java
tcp 0 0 172.17.7.81:40980 115.225.153.215:17292 ESTABLISHED 3686/java
tcp 0 0 172.17.7.81:54166 123.149.162.129:18269 ESTABLISHED 3766/java
tcp 0 0 172.17.7.81:60712 120.35.190.184:33054 ESTABLISHED 3766/java
tcp 0 0 172.17.7.81:55802 106.42.211.65:59547 ESTABLISHED 3766/java
同時,分析線程堆棧信息(jstack -F -l pid > thread_stack.log
)可以看到如下信息:
"pool-1-thread-45" #55 prio=5 os_prio=0 tid=0x00007f78702df000 nid=0x33d5 runnable [0x00007f7830c1d000]
java.lang.Thread.State: RUNNABLE
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
at java.net.SocketInputStream.read(SocketInputStream.java:171)
at java.net.SocketInputStream.read(SocketInputStream.java:141)
at org.apache.http.impl.io.SessionInputBufferImpl.streamRead(SessionInputBufferImpl.java:139)
at org.apache.http.impl.io.SessionInputBufferImpl.fillBuffer(SessionInputBufferImpl.java:155)
at org.apache.http.impl.io.SessionInputBufferImpl.readLine(SessionInputBufferImpl.java:284)
at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:140)
at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:57)
at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:261)
at org.apache.http.impl.DefaultBHttpClientConnection.receiveResponseHeader(DefaultBHttpClientConnection.java:165)
at org.apache.http.impl.conn.CPoolProxy.receiveResponseHeader(CPoolProxy.java:167)
at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:272)
at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:124)
at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:271)
at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:184)
at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:88)
at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:110)
at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:184)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:71)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:220)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:164)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:139)
at org.chench.extra.HttpClientTest.commnicate(HttpClientTest.java:106) # 線程在這裏阻塞
at org.chench.extra.HttpClientTest.request(HttpClientTest.java:93)
at org.chench.extra.HttpClientTest.access$100(HttpClientTest.java:31)
at org.chench.extra.HttpClientTest$1.run(HttpClientTest.java:62)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Locked ownable synchronizers:
- <0x0000000086d50638> (a java.util.concurrent.ThreadPoolExecutor$Worker)
從堆棧日誌中可以看到,線程處於RUNNABLE
狀態,並且阻塞在at org.chench.extra.HttpClientTest.commnicate(HttpClientTest.java:106)
處.
解決方案
通過線程堆棧日誌分析可以定位到線程夯住是因為HttpClient在執行訪問時被阻塞了,結合源代碼找到阻塞原因是因為未設置請求超時時間.
上述問題本質上是因為HttpClient組件並未設置請求超時控制導致的:雖然連接超時,但是讀取失敗,導致線程一直被阻塞.
那麽,應該如何設置HttpClient的超時時間呢?鑒於HttpClient的官方文檔沒有明確說明,並且不同版本的HttpClient組件設置超時控制的方式不一致,所以建議直接查看源碼.
HttpClient執行訪問請求時序圖如下:
順藤摸瓜,在MainClientExec.java
的execute()
方法中看到有2處使用了timeout參數,其含義各不相同:
(1)在獲取HttpClientConnection對象時需要讀取配置參數中的ConnectionRequestTimeout
值,該參數用於控制獲取連接的超時時間.
(2)獲取到HttpClientConnection對象之後讀取配置參數中的SocketTimeout
值,設置Socket超時時間.
顯然,這2個timeout參數都需要從RequestConfig對象中獲取.
既然找到了使用timeout參數的地方,下一步需要確定該參數是如何設置的.沿著HttpClient的請求時序圖路徑往回查找,在InternalHttpClient.java
類的doExecute()
方法中可以很清晰地看到設置了RequestConfig對象參數.
@Override
protected CloseableHttpResponse doExecute(
final HttpHost target,
final HttpRequest request,
final HttpContext context) throws IOException, ClientProtocolException {
Args.notNull(request, "HTTP request");
HttpExecutionAware execAware = null;
if (request instanceof HttpExecutionAware) {
execAware = (HttpExecutionAware) request;
}
try {
final HttpRequestWrapper wrapper = HttpRequestWrapper.wrap(request, target);
final HttpClientContext localcontext = HttpClientContext.adapt(
context != null ? context : new BasicHttpContext());
// 通過RequestConfig對象配置連接參數
RequestConfig config = null;
if (request instanceof Configurable) {
// 如果在HttpRequest對象中設置了RequestConfig屬性,直接使用
config = ((Configurable) request).getConfig();
}
if (config == null) {
// 如果在HttpRequest對象中未設置RequestConfig對象屬性,則獲取HttpParams屬性構造RequestConfig對象
final HttpParams params = request.getParams();
if (params instanceof HttpParamsNames) {
if (!((HttpParamsNames) params).getNames().isEmpty()) {
config = HttpClientParamConfig.getRequestConfig(params);
}
} else {
config = HttpClientParamConfig.getRequestConfig(params);
}
}
if (config != null) {
// 使用RequestConfig對象配置連接參數
localcontext.setRequestConfig(config);
}
setupContext(localcontext);
final HttpRoute route = determineRoute(target, wrapper, localcontext);
return this.execChain.execute(route, wrapper, localcontext, execAware);
} catch (final HttpException httpException) {
throw new ClientProtocolException(httpException);
}
}
(3)HttpClient默認使用的連接池為PoolingHttpClientConnectionManager
,在建立連接時(connect()方法
)會使用其中的SocketConfig
配置參數對Socket進行配置,如下所示:
- PoolingHttpClientConnectionManager.java
@Override
public void connect(
final HttpClientConnection managedConn,
final HttpRoute route,
final int connectTimeout,
final HttpContext context) throws IOException {
Args.notNull(managedConn, "Managed Connection");
Args.notNull(route, "HTTP route");
final ManagedHttpClientConnection conn;
synchronized (managedConn) {
final CPoolEntry entry = CPoolProxy.getPoolEntry(managedConn);
conn = entry.getConnection();
}
final HttpHost host;
if (route.getProxyHost() != null) {
host = route.getProxyHost();
} else {
host = route.getTargetHost();
}
final InetSocketAddress localAddress = route.getLocalSocketAddress();
SocketConfig socketConfig = this.configData.getSocketConfig(host);
if (socketConfig == null) {
// 使用配置參數SocketConfig
socketConfig = this.configData.getDefaultSocketConfig();
}
if (socketConfig == null) {
socketConfig = SocketConfig.DEFAULT;
}
this.connectionOperator.connect(
conn, host, localAddress, connectTimeout, socketConfig, context);
}
- DefaultHttpClientConnectionOperator.java
@Override
public void connect(
final ManagedHttpClientConnection conn,
final HttpHost host,
final InetSocketAddress localAddress,
final int connectTimeout,
final SocketConfig socketConfig,
final HttpContext context) throws IOException {
final Lookup<ConnectionSocketFactory> registry = getSocketFactoryRegistry(context);
final ConnectionSocketFactory sf = registry.lookup(host.getSchemeName());
if (sf == null) {
throw new UnsupportedSchemeException(host.getSchemeName() +
" protocol is not supported");
}
final InetAddress[] addresses = host.getAddress() != null ?
new InetAddress[] { host.getAddress() } : this.dnsResolver.resolve(host.getHostName());
final int port = this.schemePortResolver.resolve(host);
for (int i = 0; i < addresses.length; i++) {
final InetAddress address = addresses[i];
final boolean last = i == addresses.length - 1;
Socket sock = sf.createSocket(context);
// 使用socketConfig參數中的超時時間對Socket進行配置
sock.setSoTimeout(socketConfig.getSoTimeout());
sock.setReuseAddress(socketConfig.isSoReuseAddress());
sock.setTcpNoDelay(socketConfig.isTcpNoDelay());
sock.setKeepAlive(socketConfig.isSoKeepAlive());
if (socketConfig.getRcvBufSize() > 0) {
sock.setReceiveBufferSize(socketConfig.getRcvBufSize());
}
if (socketConfig.getSndBufSize() > 0) {
sock.setSendBufferSize(socketConfig.getSndBufSize());
}
final int linger = socketConfig.getSoLinger();
if (linger >= 0) {
sock.setSoLinger(true, linger);
}
conn.bind(sock);
// ...
}
}
經過源碼解讀可以很明確地知道,在HttpClient 4.5.2版本中,設置連接參數有3種方式:
(1)在HttpRequest對象中設置RequestConfig
對象屬性
(2)在HttpRequest對象中設置HttpParams
對象屬性.
(3)在連接池對象中設置SocketConfig
對象屬性
既然找到了根源,下面分別通過這3種方式設置超時參數進行驗證.
方式1: 通過RequestConfig對象設置超時參數
int timeOut = 5000;
RequestConfig requestConfig = RequestConfig.custom()
.setConnectionRequestTimeout(timeOut) // 獲取連接超時時間
.setConnectTimeout(timeOut) // 設置HTTP連接超時時間
.setSocketTimeout(timeOut) // 設置Socket超時時間
.build();
request.setConfig(requestConfig);
方式2: 通過HttpParams對象設置超時參數
int timeOut = 5000;
HttpParams params = new BasicHttpParams();
params.setParameter(CoreConnectionPNames.SO_TIMEOUT, timeOut); // 設置Socket超時時間
params.setParameter(CoreConnectionPNames.CONNECTION_TIMEOUT, timeOut); // 設置HTTP連接超時時間
request.setParams(params);
方式3: 通過連接池對象設置超時參數
int timeOut = 5000;
PoolingHttpClientConnectionManager connManager = new PoolingHttpClientConnectionManager();
// 對連接池設置SocketConfig對象
connManager.setDefaultSocketConfig(SocketConfig.custom().setSoTimeout(timeOut).build());
client = HttpClients.custom().setConnectionManager(connManager).build();
通過上述3種方式分別設置超時參數驗證,雖然在運行過程中會有報錯,但是不會導致線程被阻塞,進程能正常運行結束:
解決問題之後客戶端CPU使用率恢復正常:
總結/教訓/反思
- 之所以會遇到這樣的問題,還是因為對HttpClient組件不熟悉導致的;另外,在發現問題之後如何快速定位問題,並搜索穩定的解決方案很重要。
- HttpClient組件每個版本的API變化都比較大,在使用時一定要徹底清楚當前使用的版本是如何設置超時時間的。而如何確定知道超時時間控制,通過源代碼查看最為妥當.
在Java平臺使用Http客戶端組件,可以有多個選擇:
(1)直接使用JDK提供的URL類訪問
(2)使用HttpClient組件,有坑,不同版本設置參數的方式變動較大,最好是閱讀一下當前使用版本的源碼實現,正確設置好超時時間等參數
(3)如果使用了Spring MVC框架,還可以使用Spring MVC提供的RestTemplate組件,底層是使用Netty實現的客戶端遇到的這個坑本身並不屬於技術難點,但是面對這個問題的解決思路值得總結:
(1)程序日誌,運行日誌非常關鍵,是定位問題時第一時間需要查看的
(2)代碼review,逐行逐行地審查,首先排除可能存在的代碼邏輯問題,比如:死鎖等
(3)通過jstack命令查看線程堆棧信息:jstack -l -F <pid> > stack.log
(4)通過jmap命令查看內存堆棧信息:jmap -dump:live format=b,file=heap.bin <pid>
(5)如果結合搜索引擎和上述排查步驟依然未能解決問題,應該第一時間考慮直接閱讀組件的源代碼實現,特別是使用了開源組件時這可能才是真正解決問題的最佳路徑
【參考】
https://blog.csdn.net/u011191463/article/details/78664896 HttpClient超時設置詳解
https://my.oschina.net/jywm/blog/1834702 解決httpclient超時設置不生效的問題
https://www.jianshu.com/p/4b3e172c4f2d HttpClient 4.5.2-(四)連接超時的配置
https://www.jianshu.com/p/6a41c95855e3 HttpClient 4.5.2-(五)連接池的配置
https://www.jianshu.com/p/c852cbcf3d68 HttpClient高並發下性能優化-http連接池
https://blog.csdn.net/u011402596/article/details/44619443 HttpClient 多線程處理
https://field-notes.iteye.com/blog/2383759 多線程消費使用HttpClient引發的坑
http://blog.51cto.com/lihao001/1788490 httpclient4.3 導致線程阻塞
https://study121007.iteye.com/blog/2304274 HttpClient4.5.2 連接管理
https://www.jianshu.com/p/c852cbcf3d68 HttpClient高並發下性能優化-http連接池
https://alafqq.iteye.com/blog/2325041 httpclient 多線程執行(網上版本太多了。。。誤人子弟)
https://gaozzsoft.iteye.com/blog/2352241 HttpClient 4.5.2版本設置連接超時時間-CloseableHttpClient設置Timeout
https://www.cnblogs.com/softidea/p/6964347.html HttpClient 專題
https://blog.csdn.net/Fhaohaizi/article/details/78217903 httpclient4.5如何確保資源釋放
https://blog.csdn.net/u010634066/article/details/83120122 一場HttpClient調用未關閉流引發的問題
https://www.cnblogs.com/mumuxinfei/p/5066633.html Apache HttpClient使用之阻塞陷阱
https://issues.apache.org/jira/browse/HTTPCLIENT-1584 CloseableHttpClient - SSL Handshake has no Socket Timeout
https://monkeyissexy.github.io/2016/11/11/httpclient_ssl_handshake_socketTimeout_bug/ httpclient ssl handshake socketTimeout bug 分析解決過程
http://geekerwang.com/2017/10/22/%E8%AE%B0%E5%BD%95%E4%B8%80%E6%AC%A1%E5%9D%91%EF%BC%9AsocketRead-hang/ 記錄一次坑:socketRead hang
https://www.cnblogs.com/jessezeng/p/7448636.html 解決: httpclient ssl 驗證導致死鎖問題
http://itbang.me/solu/detail/201 JAVA線程卡死問題如何定位?
HttpClient在多線程環境下踩坑總結