
ES data node drops out of the cluster; the system log reports 120-second timeouts


Cluster info: CentOS 7.2, ES 6.2.2. MASTER: 3 × physical machines with 16 cores / 128 GB. DATA: 40 × machines with 16 cores / 128 GB / 12 × 6 TB HDDs in RAID 0. JVM heap set to 30 GB. Currently there is only one index, about 10 TB per day (including replicas), 160 shards, 1 replica, retained for 7 days.
Fault description: some node (seemingly at random) keeps dropping out of the cluster for no obvious reason. The node's load spikes above 100, even typing a command at the shell hangs, and only a forced restart fixes it. After the force_merge job was added it became much worse.

Background: previously the problem showed up roughly once a month. A while ago I added a scheduled job that runs force_merge with max_num_segments=1 every day starting at 1 a.m. (roughly the force-merge call sketched below); each run takes about 12 hours to finish, and it made the problem noticeably worse. Failures mostly happened between 4 and 6 a.m., at least once and sometimes several times a week, causing write throughput to drop badly and leaving the cluster semi-unavailable (writes backing up, data no longer near-real-time). The surge in failures clearly started after the merge job was added. After several days of fruitless troubleshooting, and since there was little need to query historical data, I disabled the scheduled job, but the underlying problem was never actually solved.
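The scheduled job was, in effect, a daily call to Elasticsearch's force-merge API, something like the sketch below (the index name is hypothetical; max_num_segments=1 matches the "force_merge=1" job described above):

# merge every shard of the previous day's index down to a single segment (very I/O-heavy on a 12-disk RAID 0 volume)
curl -XPOST 'http://localhost:9200/logs-2018.08.03/_forcemerge?max_num_segments=1'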

At the moment there are two questions:

1. Why do nodes drop out of the cluster at all, and why does it keep happening from time to time with no regular pattern?
2. After a single node drops out, why does the throughput of the whole cluster fall so sharply, from over 700,000 write QPS down to around 300,000?

Hardware problems seem unlikely: the node recovers after a restart, and the systems team checked and found no hardware alerts. If anyone has run into this before, or has ideas on how to troubleshoot it, any suggestions would be appreciated. Below is the information I have collected.
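For the next occurrence, these are the kinds of commands that could be captured before the forced restart, if the shell still responds at all (a sketch; the node IP is the one from this incident, everything else is standard Linux / ES 6.x tooling):

# processes stuck in uninterruptible sleep (state D), i.e. the tasks the kernel hung-task warnings are about
ps -eo pid,stat,wchan:32,cmd | awk '$2 ~ /D/'
# per-device utilisation and await times, to see whether the RAID 0 volume is saturated
iostat -x 1 5
# busiest Elasticsearch threads on the suspect node
curl -s 'http://10.135.6.226:9200/_nodes/10.135.6.226/hot_threads?threads=5'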

Item 1: at the time the problem occurred (22:52), /var/log/messages was flooded with entries like the following:
Aug 4 22:52:54 tjtx135-6-226 kernel: [4981123.994585] INFO: task java:104611 blocked for more than 120 seconds.
Aug 4 22:52:54 tjtx135-6-226 kernel: [4981123.994630] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Aug 4 22:52:54 tjtx135-6-226 kernel: [4981123.994682] java D ffffffffffffffff 0 104611 1 0x00000100
Aug 4 22:52:54 tjtx135-6-226 kernel: [4981123.994685] ffff88013f05fc20 0000000000000082 ffff88001e6ee780 ffff88013f05ffd8
Aug 4 22:52:54 tjtx135-6-226 kernel: [4981123.994691] ffff88013f05ffd8 ffff88013f05ffd8 ffff88001e6ee780 ffff88013f05fd68
Aug 4 22:52:54 tjtx135-6-226 kernel: [4981123.994696] ffff88013f05fd70 7fffffffffffffff ffff88001e6ee780 ffffffffffffffff
Aug 4 22:52:54 tjtx135-6-226 kernel: [4981123.994701] Call Trace:
Aug 4 22:52:54 tjtx135-6-226 kernel: [4981123.994706] [<ffffffff8163a909>] schedule+0x29/0x70
Aug 4 22:52:54 tjtx135-6-226 kernel: [4981123.994710] [<ffffffff816385f9>] schedule_timeout+0x209/0x2d0
Aug 4 22:52:54 tjtx135-6-226 kernel: [4981123.994715] [<ffffffff8101c829>] ? read_tsc+0x9/0x10
Aug 4 22:52:54 tjtx135-6-226 kernel: [4981123.994720] [<ffffffff810d814c>] ? ktime_get_ts64+0x4c/0xf0
Aug 4 22:52:54 tjtx135-6-226 kernel: [4981123.994723] [<ffffffff8112882f>] ? delayacct_end+0x8f/0xb0
Aug 4 22:52:54 tjtx135-6-226 kernel: [4981123.994728] [<ffffffff8163acd6>] wait_for_completion+0x116/0x170
Aug 4 22:52:54 tjtx135-6-226 kernel: [4981123.994733] [<ffffffff810b8c10>] ? wake_up_state+0x20/0x20
Aug 4 22:52:54 tjtx135-6-226 kernel: [4981123.994737] [<ffffffff8109e7ac>] flush_work+0xfc/0x1c0
Aug 4 22:52:54 tjtx135-6-226 kernel: [4981123.994741] [<ffffffff8109a7e0>] ? move_linked_works+0x90/0x90
Aug 4 22:52:54 tjtx135-6-226 kernel: [4981123.994768] [<ffffffffa03a143a>] xlog_cil_force_lsn+0x8a/0x210 [xfs]
Aug 4 22:52:54 tjtx135-6-226 kernel: [4981123.994793] [<ffffffffa039fa7e>] _xfs_log_force_lsn+0x6e/0x2f0 [xfs]
Aug 4 22:52:54 tjtx135-6-226 kernel: [4981123.994798] [<ffffffff81639b12>] ? down_read+0x12/0x30
Aug 4 22:52:54 tjtx135-6-226 kernel: [4981123.994823] [<ffffffffa03824d0>] xfs_file_fsync+0x1b0/0x200 [xfs]
Aug 4 22:52:54 tjtx135-6-226 kernel: [4981123.994829] [<ffffffff8120f975>] do_fsync+0x65/0xa0
Aug 4 22:52:54 tjtx135-6-226 kernel: [4981123.994834] [<ffffffff8120fc63>] SyS_fdatasync+0x13/0x20
Aug 4 22:52:54 tjtx135-6-226 kernel: [4981123.994839] [<ffffffff81645b12>] tracesys+0xdd/0xe2
Aug 4 22:52:54 tjtx135-6-226 kernel: [4981123.994854] INFO: task java:67513 blocked for more than 120 seconds.
Aug 4 22:52:54 tjtx135-6-226 kernel: [4981123.994898] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Aug 4 22:52:54 tjtx135-6-226 kernel: [4981123.994951] java D ffff88001f8128a8 0 67513 1 0x00000100
Aug 4 22:52:54 tjtx135-6-226 kernel: [4981123.994954] ffff880054a63c20 0000000000000082 ffff880116971700 ffff880054a63fd8
Aug 4 22:52:54 tjtx135-6-226 kernel: [4981123.994959] ffff880054a63fd8 ffff880054a63fd8 ffff880116971700 ffff88001f8128a0
Aug 4 22:52:54 tjtx135-6-226 kernel: [4981123.994964] ffff88001f8128a4 ffff880116971700 00000000ffffffff ffff88001f8128a8
Aug 4 22:52:54 tjtx135-6-226 kernel: [4981123.994970] Call Trace:
Aug 4 22:52:54 tjtx135-6-226 kernel: [4981123.994975] [<ffffffff8163b9e9>] schedule_preempt_disabled+0x29/0x70
Aug 4 22:52:54 tjtx135-6-226 kernel: [4981123.994979] [<ffffffff816396e5>] __mutex_lock_slowpath+0xc5/0x1c0
Aug 4 22:52:54 tjtx135-6-226 kernel: [4981123.994983] [<ffffffff811e8a87>] ? unlazy_walk+0x87/0x140
Aug 4 22:52:54 tjtx135-6-226 kernel: [4981123.994987] [<ffffffff81638b4f>] mutex_lock+0x1f/0x2f
Aug 4 22:52:54 tjtx135-6-226 kernel: [4981123.994992] [<ffffffff8163251e>] lookup_slow+0x33/0xa7
Aug 4 22:52:54 tjtx135-6-226 kernel: [4981123.994996] [<ffffffff811edf13>] path_lookupat+0x773/0x7a0
Aug 4 22:52:54 tjtx135-6-226 kernel: [4981123.995001] [<ffffffff811c0e65>] ? kmem_cache_alloc+0x35/0x1d0
Aug 4 22:52:54 tjtx135-6-226 kernel: [4981123.995005] [<ffffffff811eec0f>] ? getname_flags+0x4f/0x1a0
Aug 4 22:52:54 tjtx135-6-226 kernel: [4981123.995008] [<ffffffff811edf6b>] filename_lookup+0x2b/0xc0
Aug 4 22:52:54 tjtx135-6-226 kernel: [4981123.995013] [<ffffffff811efd37>] user_path_at_empty+0x67/0xc0
Aug 4 22:52:54 tjtx135-6-226 kernel: [4981123.995018] [<ffffffff81101072>] ? from_kgid_munged+0x12/0x20
Aug 4 22:52:54 tjtx135-6-226 kernel: [4981123.995023] [<ffffffff811e3aef>] ? cp_new_stat+0x14f/0x180
Aug 4 22:52:54 tjtx135-6-226 kernel: [4981123.995027] [<ffffffff811efda1>] user_path_at+0x11/0x20
Aug 4 22:52:54 tjtx135-6-226 kernel: [4981123.995032] [<ffffffff811e35e3>] vfs_fstatat+0x63/0xc0
Aug 4 22:52:54 tjtx135-6-226 kernel: [4981123.995036] [<ffffffff811e3bb1>] SYSC_newlstat+0x31/0x60
Aug 4 22:52:54 tjtx135-6-226 kernel: [4981123.995042] [<ffffffff810222fd>] ? syscall_trace_enter+0x17d/0x220
Aug 4 22:52:54 tjtx135-6-226 kernel: [4981123.995047] [<ffffffff81645ab3>] ? tracesys+0x7e/0xe2
Aug 4 22:52:54 tjtx135-6-226 kernel: [4981123.995052] [<ffffffff811e3e3e>] SyS_newlstat+0xe/0x10
Aug 4 22:52:54 tjtx135-6-226 kernel: [4981123.995056] [<ffffffff81645b12>] tracesys+0xdd/0xe2

Based on the errors above, I searched for solutions to hung_task_timeout_secs / "blocked for more than 120 seconds", and applied the recommended parameter changes (quoted below), but the problem still occurs.


How to handle hung_task_timeout_secs and "blocked for more than 120 seconds" on Linux

The Linux system stops responding, and /var/log/messages fills up with '"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.' and "blocked for more than 120 seconds" errors.

Cause:

By default, Linux will use up to 40% of available memory as file system cache. When that threshold is exceeded, the file system flushes all cached data to disk, and subsequent I/O requests become synchronous. Flushing the cache to disk is subject to a default timeout of 120 seconds. The errors above mean the I/O subsystem is not fast enough to write the cached data out within those 120 seconds. As the I/O system slows down, more and more requests pile up, eventually all system memory is consumed and the system stops responding.

Fix:

Tune vm.dirty_ratio and vm.dirty_background_ratio to suit the application. For example, the following settings are recommended:
# sysctl -w vm.dirty_ratio=10
# sysctl -w vm.dirty_background_ratio=5
# sysctl -p

To make the change permanent, edit /etc/sysctl.conf and add the following two lines (the change takes effect after a reboot):
# vi /etc/sysctl.conf
vm.dirty_background_ratio = 5
vm.dirty_ratio = 10
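Whether or not these ratios are the real culprit here, the blog's theory could be checked directly the next time a node stalls, by looking at the actual dirty-page state (standard Linux interfaces, nothing specific to this cluster):

# how much dirty / writeback data the kernel is holding at that moment
grep -E '^(Dirty|Writeback):' /proc/meminfo
# the writeback thresholds currently in effect
sysctl vm.dirty_ratio vm.dirty_background_ratio vm.dirty_expire_centisecs vm.dirty_writeback_centisecs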




Based on the blog's conclusion, I suspected this might be related to available memory being used as file system cache. However, comparing the system monitoring before and after the failure, nothing abnormal shows up before the failure happened. The charts below compare monitoring for a normal node (10.135.6.227) and the abnormal node (10.135.6.226).
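(For a quick live comparison of two nodes outside the monitoring system, the _cat/nodes API also gives a one-line summary per node; a sketch, with column names as documented for the ES 6.x _cat APIs:)

# heap, RAM, CPU and load averages per node, filtered down to the two nodes being compared
curl -s 'http://localhost:9200/_cat/nodes?v&h=ip,name,heap.percent,ram.percent,cpu,load_1m,load_5m,load_15m' | grep -E 'ip|10\.135\.6\.22[67]'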



Memory:

[image]

IO:

[image]

LOAD:

[image]


There is also a kernel-related chart where the two nodes differ at the point of failure; apart from that, the two are essentially identical.

[image]


A few JVM monitoring charts as well:

[JVM monitoring charts]



Master node log:

[image]


Log from the faulting node:
[2018-08-04T06:49:12,265][WARN ][o.e.m.j.JvmGcMonitorService] [10.135.6.226] [gc][young][1013831][93448] duration [1.1s], collections [1]/[7s], total [1.1s]/[1.2h], memory [22.8gb]->[9.4gb]/[29.2gb], all_pools {[young] [6.1gb]->[1.9mb]/[6.4gb]}{[survivor] [633.8mb]->[0b]/[819.1mb]}{[old] [16.1gb]->[9.4gb]/[22gb]}
[2018-08-04T06:49:12,275][INFO ][o.e.m.j.JvmGcMonitorService] [10.135.6.226] [gc][old][1013831][1217] duration [5.1s], collections [1]/[7s], total [5.1s]/[4.3m], memory [22.8gb]->[9.4gb]/[29.2gb], all_pools {[young] [6.1gb]->[1.9mb]/[6.4gb]}{[survivor] [633.8mb]->[0b]/[819.1mb]}{[old] [16.1gb]->[9.4gb]/[22gb]}
[2018-08-04T06:49:12,275][WARN ][o.e.m.j.JvmGcMonitorService] [10.135.6.226] [gc][1013831] overhead, spent [6.3s] collecting in the last [7s]
[2018-08-04T22:51:04,451][ERROR][o.e.x.m.c.n.NodeStatsCollector] [10.135.6.226] collector [node_stats] timed out when collecting data
[2018-08-04T22:51:14,468][ERROR][o.e.a.b.TransportBulkAction] [10.135.6.226] failed to execute pipeline for a bulk request
org.elasticsearch.common.util.concurrent.EsRejectedExecutionException: rejected execution of [email protected] on EsThreadPoolExecutor[name = 10.135.6.226/bulk, queue capacity = 200, [email protected]accc58[Running, pool size = 32, active threads = 32, queued tasks = 305, completed tasks = 160486966]]
at org.elasticsearch.common.util.concurrent.EsAbortPolicy.rejectedExecution(EsAbortPolicy.java:48) ~[elasticsearch-6.2.2.jar:6.2.2]
at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:823) ~[?:1.8.0_66]
at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1369) ~[?:1.8.0_66]
at org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor.doExecute(EsThreadPoolExecutor.java:98) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor.execute(EsThreadPoolExecutor.java:93) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.ingest.PipelineExecutionService.executeBulkRequest(PipelineExecutionService.java:75) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.bulk.TransportBulkAction.processBulkIndexIngestRequest(TransportBulkAction.java:496) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.bulk.TransportBulkAction.doExecute(TransportBulkAction.java:135) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.bulk.TransportBulkAction.doExecute(TransportBulkAction.java:86) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.support.TransportAction$RequestFilterChain.proceed(TransportAction.java:167) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.xpack.security.action.filter.SecurityActionFilter.apply(SecurityActionFilter.java:133) ~[?:?]
at org.elasticsearch.action.support.TransportAction$RequestFilterChain.proceed(TransportAction.java:165) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:139) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:81) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.client.node.NodeClient.executeLocally(NodeClient.java:83) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.client.node.NodeClient.doExecute(NodeClient.java:72) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.client.support.AbstractClient.execute(AbstractClient.java:405) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.client.support.AbstractClient.bulk(AbstractClient.java:482) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.xpack.core.ClientHelper.executeAsyncWithOrigin(ClientHelper.java:73) ~[x-pack-core-6.2.2.jar:6.2.2]
at org.elasticsearch.xpack.monitoring.exporter.local.LocalBulk.doFlush(LocalBulk.java:120) ~[?:?]
at org.elasticsearch.xpack.monitoring.exporter.ExportBulk.flush(ExportBulk.java:72) ~[?:?]
at org.elasticsearch.xpack.monitoring.exporter.ExportBulk$Compound.lambda$doFlush$1(ExportBulk.java:166) ~[?:?]
at org.elasticsearch.xpack.core.common.IteratingActionListener.run(IteratingActionListener.java:93) [x-pack-core-6.2.2.jar:6.2.2]
at org.elasticsearch.xpack.monitoring.exporter.ExportBulk$Compound.doFlush(ExportBulk.java:182) [x-pack-monitoring-6.2.2.jar:6.2.2]
at org.elasticsearch.xpack.monitoring.exporter.ExportBulk.flushAndClose(ExportBulk.java:96) [x-pack-monitoring-6.2.2.jar:6.2.2]
at org.elasticsearch.xpack.monitoring.exporter.ExportBulk.close(ExportBulk.java:86) [x-pack-monitoring-6.2.2.jar:6.2.2]
at org.elasticsearch.xpack.monitoring.exporter.Exporters.export(Exporters.java:205) [x-pack-monitoring-6.2.2.jar:6.2.2]
at org.elasticsearch.xpack.monitoring.MonitoringService$MonitoringExecution$1.doRun(MonitoringService.java:231) [x-pack-monitoring-6.2.2.jar:6.2.2]
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-6.2.2.jar:6.2.2]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_66]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_66]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:573) [elasticsearch-6.2.2.jar:6.2.2]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_66]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_66]
at java.lang.Thread.run(Thread.java:745) [?:1.8.0_66]
[2018-08-04T22:51:14,473][WARN ][o.e.x.m.MonitoringService] [10.135.6.226] monitoring execution failed
org.elasticsearch.xpack.monitoring.exporter.ExportException: Exception when closing export bulk
at org.elasticsearch.xpack.monitoring.exporter.ExportBulk$1$1.<init>(ExportBulk.java:107) ~[?:?]
at org.elasticsearch.xpack.monitoring.exporter.ExportBulk$1.onFailure(ExportBulk.java:105) ~[?:?]
at org.elasticsearch.xpack.monitoring.exporter.ExportBulk$Compound$1.onResponse(ExportBulk.java:218) ~[?:?]
at org.elasticsearch.xpack.monitoring.exporter.ExportBulk$Compound$1.onResponse(ExportBulk.java:212) ~[?:?]
at org.elasticsearch.xpack.core.common.IteratingActionListener.onResponse(IteratingActionListener.java:108) ~[?:?]
at org.elasticsearch.xpack.monitoring.exporter.ExportBulk$Compound.lambda$doFlush$0(ExportBulk.java:176) ~[?:?]
at org.elasticsearch.action.ActionListener$1.onFailure(ActionListener.java:68) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.xpack.monitoring.exporter.local.LocalBulk.lambda$doFlush$1(LocalBulk.java:127) ~[?:?]
at org.elasticsearch.action.ActionListener$1.onFailure(ActionListener.java:68) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.support.ContextPreservingActionListener.onFailure(ContextPreservingActionListener.java:50) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.support.TransportAction$1.onFailure(TransportAction.java:91) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.bulk.TransportBulkAction.lambda$processBulkIndexIngestRequest$4(TransportBulkAction.java:503) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.ingest.PipelineExecutionService$2.onFailure(PipelineExecutionService.java:79) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.common.util.concurrent.AbstractRunnable.onRejection(AbstractRunnable.java:63) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.onRejection(ThreadContext.java:662) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor.doExecute(EsThreadPoolExecutor.java:104) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor.execute(EsThreadPoolExecutor.java:93) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.ingest.PipelineExecutionService.executeBulkRequest(PipelineExecutionService.java:75) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.bulk.TransportBulkAction.processBulkIndexIngestRequest(TransportBulkAction.java:496) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.bulk.TransportBulkAction.doExecute(TransportBulkAction.java:135) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.bulk.TransportBulkAction.doExecute(TransportBulkAction.java:86) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.support.TransportAction$RequestFilterChain.proceed(TransportAction.java:167) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.xpack.security.action.filter.SecurityActionFilter.apply(SecurityActionFilter.java:133) ~[?:?]
at org.elasticsearch.action.support.TransportAction$RequestFilterChain.proceed(TransportAction.java:165) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:139) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:81) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.client.node.NodeClient.executeLocally(NodeClient.java:83) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.client.node.NodeClient.doExecute(NodeClient.java:72) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.client.support.AbstractClient.execute(AbstractClient.java:405) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.client.support.AbstractClient.bulk(AbstractClient.java:482) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.xpack.core.ClientHelper.executeAsyncWithOrigin(ClientHelper.java:73) ~[x-pack-core-6.2.2.jar:6.2.2]
at org.elasticsearch.xpack.monitoring.exporter.local.LocalBulk.doFlush(LocalBulk.java:120) ~[?:?]
at org.elasticsearch.xpack.monitoring.exporter.ExportBulk.flush(ExportBulk.java:72) ~[?:?]
at org.elasticsearch.xpack.monitoring.exporter.ExportBulk$Compound.lambda$doFlush$1(ExportBulk.java:166) ~[?:?]
at org.elasticsearch.xpack.core.common.IteratingActionListener.run(IteratingActionListener.java:93) [x-pack-core-6.2.2.jar:6.2.2]
at org.elasticsearch.xpack.monitoring.exporter.ExportBulk$Compound.doFlush(ExportBulk.java:182) [x-pack-monitoring-6.2.2.jar:6.2.2]
at org.elasticsearch.xpack.monitoring.exporter.ExportBulk.flushAndClose(ExportBulk.java:96) [x-pack-monitoring-6.2.2.jar:6.2.2]
at org.elasticsearch.xpack.monitoring.exporter.ExportBulk.close(ExportBulk.java:86) [x-pack-monitoring-6.2.2.jar:6.2.2]
at org.elasticsearch.xpack.monitoring.exporter.Exporters.export(Exporters.java:205) [x-pack-monitoring-6.2.2.jar:6.2.2]
at org.elasticsearch.xpack.monitoring.MonitoringService$MonitoringExecution$1.doRun(MonitoringService.java:231) [x-pack-monitoring-6.2.2.jar:6.2.2]
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-6.2.2.jar:6.2.2]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_66]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_66]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:573) [elasticsearch-6.2.2.jar:6.2.2]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_66]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_66]
at java.lang.Thread.run(Thread.java:745) [?:1.8.0_66]
Caused by: org.elasticsearch.xpack.monitoring.exporter.ExportException: failed to flush export bulks
at org.elasticsearch.xpack.monitoring.exporter.ExportBulk$Compound.lambda$doFlush$0(ExportBulk.java:168) ~[?:?]
... 41 more
Caused by: org.elasticsearch.xpack.monitoring.exporter.ExportException: failed to flush export bulk [default_local]
... 40 more
Caused by: org.elasticsearch.common.util.concurrent.EsRejectedExecutionException: rejected execution of [email protected] on EsThreadPoolExecutor[name = 10.135.6.226/bulk, queue capacity = 200, [email protected]accc58[Running, pool size = 32, active threads = 32, queued tasks = 305, completed tasks = 160486966]]
at org.elasticsearch.common.util.concurrent.EsAbortPolicy.rejectedExecution(EsAbortPolicy.java:48) ~[elasticsearch-6.2.2.jar:6.2.2]
at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:823) ~[?:1.8.0_66]
at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1369) ~[?:1.8.0_66]
at org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor.doExecute(EsThreadPoolExecutor.java:98) ~[elasticsearch-6.2.2.jar:6.2.2]
... 31 more
[2018-08-04T22:51:24,430][ERROR][o.e.a.b.TransportBulkAction] [10.135.6.226] failed to execute pipeline for a bulk request
org.elasticsearch.common.util.concurrent.EsRejectedExecutionException: rejected execution of [email protected] on EsThreadPoolExecutor[name = 10.135.6.226/bulk, queue capacity = 200, [email protected]accc58[Running, pool size = 32, active threads = 32, queued tasks = 305, completed tasks = 160486966]]
at org.elasticsearch.common.util.concurrent.EsAbortPolicy.rejectedExecution(EsAbortPolicy.java:48) ~[elasticsearch-6.2.2.jar:6.2.2]
at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:823) ~[?:1.8.0_66]
at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1369) ~[?:1.8.0_66]
at org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor.doExecute(EsThreadPoolExecutor.java:98) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor.execute(EsThreadPoolExecutor.java:93) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.ingest.PipelineExecutionService.executeBulkRequest(PipelineExecutionService.java:75) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.bulk.TransportBulkAction.processBulkIndexIngestRequest(TransportBulkAction.java:496) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.bulk.TransportBulkAction.doExecute(TransportBulkAction.java:135) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.bulk.TransportBulkAction.doExecute(TransportBulkAction.java:86) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.support.TransportAction$RequestFilterChain.proceed(TransportAction.java:167) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.xpack.security.action.filter.SecurityActionFilter.apply(SecurityActionFilter.java:133) ~[?:?]
at org.elasticsearch.action.support.TransportAction$RequestFilterChain.proceed(TransportAction.java:165) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:139) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:81) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.client.node.NodeClient.executeLocally(NodeClient.java:83) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.client.node.NodeClient.doExecute(NodeClient.java:72) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.client.support.AbstractClient.execute(AbstractClient.java:405) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.client.support.AbstractClient.bulk(AbstractClient.java:482) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.xpack.core.ClientHelper.executeAsyncWithOrigin(ClientHelper.java:73) ~[x-pack-core-6.2.2.jar:6.2.2]
at org.elasticsearch.xpack.monitoring.exporter.local.LocalBulk.doFlush(LocalBulk.java:120) ~[?:?]
at org.elasticsearch.xpack.monitoring.exporter.ExportBulk.flush(ExportBulk.java:72) ~[?:?]
at org.elasticsearch.xpack.monitoring.exporter.ExportBulk$Compound.lambda$doFlush$1(ExportBulk.java:166) ~[?:?]
at org.elasticsearch.xpack.core.common.IteratingActionListener.run(IteratingActionListener.java:93) [x-pack-core-6.2.2.jar:6.2.2]
at org.elasticsearch.xpack.monitoring.exporter.ExportBulk$Compound.doFlush(ExportBulk.java:182) [x-pack-monitoring-6.2.2.jar:6.2.2]
at org.elasticsearch.xpack.monitoring.exporter.ExportBulk.flushAndClose(ExportBulk.java:96) [x-pack-monitoring-6.2.2.jar:6.2.2]
at org.elasticsearch.xpack.monitoring.exporter.ExportBulk.close(ExportBulk.java:86) [x-pack-monitoring-6.2.2.jar:6.2.2]
at org.elasticsearch.xpack.monitoring.exporter.Exporters.export(Exporters.java:205) [x-pack-monitoring-6.2.2.jar:6.2.2]
at org.elasticsearch.xpack.monitoring.MonitoringService$MonitoringExecution$1.doRun(MonitoringService.java:231) [x-pack-monitoring-6.2.2.jar:6.2.2]
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-6.2.2.jar:6.2.2]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_66]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_66]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:573) [elasticsearch-6.2.2.jar:6.2.2]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_66]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_66]
at java.lang.Thread.run(Thread.java:745) [?:1.8.0_66]
[2018-08-04T22:51:24,434][WARN ][o.e.x.m.MonitoringService] [10.135.6.226] monitoring execution failed
org.elasticsearch.xpack.monitoring.exporter.ExportException: Exception when closing export bulk
at org.elasticsearch.xpack.monitoring.exporter.ExportBulk$1$1.<init>(ExportBulk.java:107) ~[?:?]
at org.elasticsearch.xpack.monitoring.exporter.ExportBulk$1.onFailure(ExportBulk.java:105) ~[?:?]
at org.elasticsearch.xpack.monitoring.exporter.ExportBulk$Compound$1.onResponse(ExportBulk.java:218) ~[?:?]
at org.elasticsearch.xpack.monitoring.exporter.ExportBulk$Compound$1.onResponse(ExportBulk.java:212) ~[?:?]
at org.elasticsearch.xpack.core.common.IteratingActionListener.onResponse(IteratingActionListener.java:108) ~[?:?]
at org.elasticsearch.xpack.monitoring.exporter.ExportBulk$Compound.lambda$doFlush$0(ExportBulk.java:176) ~[?:?]
at org.elasticsearch.action.ActionListener$1.onFailure(ActionListener.java:68) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.xpack.monitoring.exporter.local.LocalBulk.lambda$doFlush$1(LocalBulk.java:127) ~[?:?]
at org.elasticsearch.action.ActionListener$1.onFailure(ActionListener.java:68) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.support.ContextPreservingActionListener.onFailure(ContextPreservingActionListener.java:50) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.support.TransportAction$1.onFailure(TransportAction.java:91) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.bulk.TransportBulkAction.lambda$processBulkIndexIngestRequest$4(TransportBulkAction.java:503) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.ingest.PipelineExecutionService$2.onFailure(PipelineExecutionService.java:79) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.common.util.concurrent.AbstractRunnable.onRejection(AbstractRunnable.java:63) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.onRejection(ThreadContext.java:662) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor.doExecute(EsThreadPoolExecutor.java:104) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor.execute(EsThreadPoolExecutor.java:93) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.ingest.PipelineExecutionService.executeBulkRequest(PipelineExecutionService.java:75) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.bulk.TransportBulkAction.processBulkIndexIngestRequest(TransportBulkAction.java:496) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.bulk.TransportBulkAction.doExecute(TransportBulkAction.java:135) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.bulk.TransportBulkAction.doExecute(TransportBulkAction.java:86) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.support.TransportAction$RequestFilterChain.proceed(TransportAction.java:167) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.xpack.security.action.filter.SecurityActionFilter.apply(SecurityActionFilter.java:133) ~[?:?]
at org.elasticsearch.action.support.TransportAction$RequestFilterChain.proceed(TransportAction.java:165) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:139) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:81) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.client.node.NodeClient.executeLocally(NodeClient.java:83) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.client.node.NodeClient.doExecute(NodeClient.java:72) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.client.support.AbstractClient.execute(AbstractClient.java:405) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.client.support.AbstractClient.bulk(AbstractClient.java:482) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.xpack.core.ClientHelper.executeAsyncWithOrigin(ClientHelper.java:73) ~[x-pack-core-6.2.2.jar:6.2.2]
at org.elasticsearch.xpack.monitoring.exporter.local.LocalBulk.doFlush(LocalBulk.java:120) ~[?:?]
at org.elasticsearch.xpack.monitoring.exporter.ExportBulk.flush(ExportBulk.java:72) ~[?:?]
at org.elasticsearch.xpack.monitoring.exporter.ExportBulk$Compound.lambda$doFlush$1(ExportBulk.java:166) ~[?:?]
at org.elasticsearch.xpack.core.common.IteratingActionListener.run(IteratingActionListener.java:93) [x-pack-core-6.2.2.jar:6.2.2]
at org.elasticsearch.xpack.monitoring.exporter.ExportBulk$Compound.doFlush(ExportBulk.java:182) [x-pack-monitoring-6.2.2.jar:6.2.2]
at org.elasticsearch.xpack.monitoring.exporter.ExportBulk.flushAndClose(ExportBulk.java:96) [x-pack-monitoring-6.2.2.jar:6.2.2]
at org.elasticsearch.xpack.monitoring.exporter.ExportBulk.close(ExportBulk.java:86) [x-pack-monitoring-6.2.2.jar:6.2.2]
at org.elasticsearch.xpack.monitoring.exporter.Exporters.export(Exporters.java:205) [x-pack-monitoring-6.2.2.jar:6.2.2]
at org.elasticsearch.xpack.monitoring.MonitoringService$MonitoringExecution$1.doRun(MonitoringService.java:231) [x-pack-monitoring-6.2.2.jar:6.2.2]
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-6.2.2.jar:6.2.2]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_66]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_66]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:573) [elasticsearch-6.2.2.jar:6.2.2]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_66]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_66]
at java.lang.Thread.run(Thread.java:745) [?:1.8.0_66]
Caused by: org.elasticsearch.xpack.monitoring.exporter.ExportException: failed to flush export bulks
at org.elasticsearch.xpack.monitoring.exporter.ExportBulk$Compound.lambda$doFlush$0(ExportBulk.java:168) ~[?:?]
... 41 more
Caused by: org.elasticsearch.xpack.monitoring.exporter.ExportException: failed to flush export bulk [default_local]
... 40 more
Caused by: org.elasticsearch.common.util.concurrent.EsRejectedExecutionException: rejected execution of [email protected] on EsThreadPoolExecutor[name = 10.135.6.226/bulk, queue capacity = 200, [email protected]accc58[Running, pool size = 32, active threads = 32, queued tasks = 305, completed tasks = 160486966]]
at org.elasticsearch.common.util.concurrent.EsAbortPolicy.rejectedExecution(EsAbortPolicy.java:48) ~[elasticsearch-6.2.2.jar:6.2.2]
at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:823) ~[?:1.8.0_66]
at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1369) ~[?:1.8.0_66]
at org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor.doExecute(EsThreadPoolExecutor.java:98) ~[elasticsearch-6.2.2.jar:6.2.2]
... 31 more
[2018-08-04T22:51:34,430][ERROR][o.e.a.b.TransportBulkAction] [10.135.6.226] failed to execute pipeline for a bulk request
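The repeated EsRejectedExecutionException entries show the node's bulk thread pool (32 threads, queue capacity 200) completely saturated, which lines up with question 2: while the stuck node is unresponsive, bulk requests that involve its shards back up and whole batches stall, so cluster-wide write throughput drops. One way to watch this live would be the thread-pool _cat API (a sketch; the host and column names follow the ES 6.x _cat documentation, they are not taken from the original logs):

# per-node bulk thread-pool pressure: active threads, queue depth, cumulative rejections
curl -s 'http://10.135.6.226:9200/_cat/thread_pool/bulk?v&h=node_name,active,queue,rejected,completed'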
