Troubleshooting an Ali ONS MQ consumer whose throughput drops after running for a while, causing a message backlog
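1: Symptoms:
After running for a while (about 20 h), the instance's consumption rate dropped. A restart restored normal consumption, and then it degraded again. I first suspected the ONS client version and upgraded from 1.2.7 to the latest 1.7.7.final, which had no effect. Experience suggested the real cause was poorly optimized application code.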
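Approach one: I initially assumed this was simply a GC problem caused by large objects. A jmap histogram showed many MsgContent instances, and the project turned out to use a ConcurrentHashMap<String, MsgContent> that was only ever added to and never removed from. So I added a removal step and also set value = null to help GC, but it made little difference: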
public static ConcurrentHashMap<String, MsgContent> map = new ConcurrentHashMap<String, MsgContent>();

// Iterate over the values in the map and remove the key of every value
// whose time is more than two minutes old.
public static void removeInvalidKey(ConcurrentHashMap<String, MsgContent> map) {
    for (MsgContent value : map.values()) {
        if (System.currentTimeMillis() - value.getTime() > 2 * 60 * 1000) {
            MsgMatch.map.remove(value.getUid());
            // Intended to force the object to null so that GC (System.gc()) reclaims it.
            // This only clears the local loop variable and does not change reachability.
            value = null;
        }
    }
}
 num     #instances         #bytes  class name
----------------------------------------------
   1:        651850      208798320  [C
   2:        651267       15630408  java.lang.String
   3:         71571       10226008  <constMethodKlass>
   4:         71571        9172944  <methodKlass>
   5:          6020        6965584  <constantPoolKlass>
   6:         20793        5553840  [I
   7:        153195        4902240  java.util.HashMap$Entry
   8:         24879        4784448  [B
   9:        189633        4551192  java.util.concurrent.ConcurrentLinkedDeque$Node
  10:          6020        4496624  <instanceKlassKlass>
  11:          5076        4044384  <constantPoolCacheKlass>
  12:         78356        2507392  java.util.concurrent.ConcurrentHashMap$HashEntry
  13:         64274        2506768  com.xxx.xxx.access.mysql.entity.MsgContent
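The cleanup from approach one can also be written by removing entries through the map's own iterator, which avoids the second lookup by uid and the ineffective value = null. The following is a minimal sketch, not the author's code: CachedMsg is a hypothetical stand-in for MsgContent (only a time field), and ExpiringMsgCache and removeExpired are illustrative names. The current time is passed in as a parameter so the behaviour can be checked deterministically.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the article's MsgContent; only the field used here.
class CachedMsg {
    final long time; // creation time in epoch milliseconds

    CachedMsg(long time) {
        this.time = time;
    }
}

class ExpiringMsgCache {
    static final long TTL_MILLIS = 2 * 60 * 1000L; // two minutes, as in the article

    // Removes every entry older than TTL_MILLIS relative to nowMillis
    // and returns the number of entries this call removed.
    static int removeExpired(ConcurrentHashMap<String, CachedMsg> map, long nowMillis) {
        int removed = 0;
        Iterator<Map.Entry<String, CachedMsg>> it = map.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, CachedMsg> entry = it.next();
            if (nowMillis - entry.getValue().time > TTL_MILLIS) {
                it.remove(); // ConcurrentHashMap iterators support remove()
                removed++;
            }
        }
        return removed;
    }
}
```

Once an entry is removed from the map and nothing else references the value, it becomes eligible for collection without any explicit nulling.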
Approach two: after approach one the instance ran normally for longer (30 h), but the root cause had still not been found. The only option left was a thread stack snapshot, which showed the consumer threads RUNNABLE at the same place (at this point the JVM was already hinting at the cause):
"ConsumeMessageThread_7" prio=10 tid=0x00007f6498008000 nid=0x43 runnable [0x00007f6558c82000]
   java.lang.Thread.State: RUNNABLE
        at java.util.concurrent.ConcurrentLinkedDeque.contains(ConcurrentLinkedDeque.java:1085)
        at com.xxx.xxxx.access.alimq.EvMsgRtListener.consume(EvMsgRtListener.java:169)
        at com.aliyun.openservices.ons.api.impl.rocketmq.ConsumerImpl$MessageListenerImpl.consumeMessage(ConsumerImpl.java:97)
        at com.aliyun.openservices.shade.com.alibaba.rocketmq.client.impl.consumer.ConsumeMessageConcurrentlyService$ConsumeRequest.run(ConsumeMessageConcurrentlyService.java:417)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)

   Locked ownable synchronizers:
        - <0x000000070841ea38> (a java.util.concurrent.ThreadPoolExecutor$Worker)

"ConsumeMessageThread_5" prio=10 tid=0x00007f6498004000 nid=0x42 runnable [0x00007f6558d83000]
   java.lang.Thread.State: RUNNABLE
        at java.util.concurrent.ConcurrentLinkedDeque.contains(ConcurrentLinkedDeque.java:1085)
        at com.xxx.xxxx.access.alimq.EvMsgRtListener.consume(EvMsgRtListener.java:169)
        at com.aliyun.openservices.ons.api.impl.rocketmq.ConsumerImpl$MessageListenerImpl.consumeMessage(ConsumerImpl.java:97)
        at com.aliyun.openservices.shade.com.alibaba.rocketmq.client.impl.consumer.ConsumeMessageConcurrentlyService$ConsumeRequest.run(ConsumeMessageConcurrentlyService.java:417)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)

   Locked ownable synchronizers:
        - <0x000000070841f708> (a java.util.concurrent.ThreadPoolExecutor$Worker)
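The code in question is the call at EvMsgRtListener.java:169 that appears in the stack traces. The queue used there is also processed by a scheduled task that traverses it and removes keys. These operations are expensive on a ConcurrentLinkedDeque: contains() and remove(Object) each walk the linked list node by node, so their cost grows with the number of elements, and every consumer thread pays that cost for every message (the jmap histogram lists 189633 ConcurrentLinkedDeque$Node instances). For details, see the article "Common collections in the Java Collections Framework: characteristics, use cases, and implementation principles".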
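One way to avoid the linear contains() is to keep a hash-based index next to the deque and answer membership questions from the index. The sketch below is an illustration under assumptions, not the author's fix: it assumes the queue holds String message ids used for duplicate detection, and SeenMessageIds, markSeen, and trimTo are hypothetical names.

```java
import java.util.Collections;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedDeque;

class SeenMessageIds {
    // Insertion order, kept only so the oldest ids can be expired first.
    private final ConcurrentLinkedDeque<String> order = new ConcurrentLinkedDeque<String>();
    // Hash-based index used for membership tests instead of Deque.contains().
    private final Set<String> index =
            Collections.newSetFromMap(new ConcurrentHashMap<String, Boolean>());

    // Records the id and returns true if it had not been seen before.
    public boolean markSeen(String id) {
        if (index.add(id)) {
            order.addLast(id);
            return true;
        }
        return false;
    }

    // Hash lookup; does not traverse the deque.
    public boolean contains(String id) {
        return index.contains(id);
    }

    // Intended for the scheduled task: drops the oldest ids until at most maxSize remain.
    public void trimTo(int maxSize) {
        while (index.size() > maxSize) {
            String oldest = order.pollFirst();
            if (oldest == null) {
                break; // nothing left to expire right now
            }
            index.remove(oldest);
        }
    }
}
```

The scheduled task then only polls from the head of the deque, and the consumer threads never traverse it.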
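Once the problem was located, commenting out that code resolved it. To summarize: I had run into the same scenario several times before, where CPU usage spikes after running for a while and consumption capacity drops. Those cases also involved remote HTTP calls with SocketTimeout(5000). Changing 5000 ms to 1 s shortens the time a consumer thread can stay blocked waiting for a response. The snippet below shows the configuration with the 5000 ms value: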
CloseableHttpClient httpclient = HttpClients.createDefault();
HttpPost http = new HttpPost(url);
/**
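* setConnectTimeout: timeout for establishing the connection, in milliseconds.
* setConnectionRequestTimeout: timeout for obtaining a connection from the connection manager, in milliseconds.
* setSocketTimeout: timeout for waiting for response data, in milliseconds. If the endpoint returns no data within this time, the call is abandoned.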
*/
RequestConfig requestConfig = RequestConfig.custom().setConnectTimeout(5000).setConnectionRequestTimeout(1000)
.setSocketTimeout(5000).build();
http.setConfig(requestConfig);
HttpEntity inEntity = EntityBuilder.create().setText(json).setContentType(ContentType.APPLICATION_JSON).build();
http.setEntity(inEntity);
CloseableHttpResponse response = httpclient.execute(http);
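To see why the shorter socket timeout matters, consider the worst case in which every remote call blocks until the timeout expires: each consumer thread then finishes at most one message per timeout period. The arithmetic can be sketched as follows. The thread count of 20 is a hypothetical value chosen for illustration, and TimeoutThroughput is not part of the original code.

```java
class TimeoutThroughput {
    // Upper bound on messages per second when every call blocks
    // for the full socket timeout before the thread is released.
    static double worstCaseMessagesPerSecond(int consumerThreads, long socketTimeoutMillis) {
        return consumerThreads * 1000.0 / socketTimeoutMillis;
    }

    public static void main(String[] args) {
        int threads = 20; // hypothetical number of consumer threads
        System.out.println(worstCaseMessagesPerSecond(threads, 5000)); // prints 4.0
        System.out.println(worstCaseMessagesPerSecond(threads, 1000)); // prints 20.0
    }
}
```

Under these assumptions, a slow downstream service caps consumption at 4 messages per second with a 5000 ms timeout and at 20 messages per second with a 1000 ms timeout, which is why a backlog builds up faster with the longer value.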
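P.S. On the subject of caching: when designing a cache, be clear about the performance, strengths, and weaknesses of each component.
The simplest option is a HashMap. As described above, the difficulty is cleaning out stale data so that it can actually be garbage collected before it accumulates and overflows memory. A good alternative is WeakHashMap, a hash table whose keys are held through weak references. As a full-fledged cache, however, it is somewhat lacking in features. See the articles "Differences between WeakHashMap and HashMap" and, in more depth, "About ReferenceQueue".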
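The WeakHashMap behaviour described above can be sketched as follows. WeakCacheDemo and its put and get helpers are hypothetical names used only for illustration. Note that WeakHashMap is not thread-safe, so the sketch wraps it with Collections.synchronizedMap, and that System.gc() is only a hint, so the final size is deliberately not asserted.

```java
import java.util.Collections;
import java.util.Map;
import java.util.WeakHashMap;

class WeakCacheDemo {
    // WeakHashMap itself is not synchronized; wrap it for use from several threads.
    static final Map<Object, String> CACHE =
            Collections.synchronizedMap(new WeakHashMap<Object, String>());

    static void put(Object key, String value) {
        CACHE.put(key, value);
    }

    static String get(Object key) {
        return CACHE.get(key);
    }

    public static void main(String[] args) {
        Object key = new Object();
        put(key, "payload");
        // While the key is strongly reachable, the entry stays in the map.
        System.out.println(get(key)); // prints payload

        key = null;  // drop the only strong reference to the key
        System.gc(); // a hint only; the JVM decides when to collect
        // The entry may or may not have been cleared by now.
        System.out.println("entries after gc hint: " + CACHE.size());
    }
}
```

Because eviction depends on the key becoming unreachable rather than on age or size, keys that stay strongly referenced elsewhere (for example interned String literals) are never evicted, which is one reason the map falls short as a general-purpose cache.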