A JVM Tuning Case Study

阿新 • Published: 2019-01-15

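This morning I checked the GC statistics during the off-peak period and found that 274 full GCs (FGC) had occurred the previous night. That did not look normal, so I started looking for the cause.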

sudo jstat -gc 20028 4000 10
 S0C    S1C    S0U    S1U      EC       EU        OC         OU       MC     MU    CCSC   CCSU   YGC     YGCT    FGC    FGCT     GCT
283328.0 283328.0 1036.4  0.0   2266752.0 589773.4 1700864.0   52111.5   41476.0 40494.4 4672.0 4507.7   5998  157.033  274    37.794  194.827
283328.0 283328.0 1036.4  0.0   2266752.0 1902667.1 1700864.0   52111.5   41476.0 40494.4 4672.0 4507.7   5998  157.033  274    37.794  194.827
283328.0 283328.0  0.0   906.8  2266752.0 1165571.6 1700864.0   52125.8   41476.0 40494.4 4672.0 4507.7   5999  157.037  274    37.794  194.831
283328.0 283328.0 588.3   0.0   2266752.0 419124.4 1700864.0   52222.6   41476.0 40494.4 4672.0 4507.7   6000  157.041  274    37.794  194.834
283328.0 283328.0 588.3   0.0   2266752.0 1858587.9 1700864.0   52222.6   41476.0 40494.4 4672.0 4507.7   6000  157.041  274    37.794  194.834
283328.0 283328.0  0.0   980.7  2266752.0 1015355.0 1700864.0   52311.3   41476.0 40494.4 4672.0 4507.7   6001  157.046  274    37.794  194.839
283328.0 283328.0 1075.6  0.0   2266752.0 376459.3 1700864.0   52369.0   41476.0 40494.4 4672.0 4507.7   6002  157.050  274    37.794  194.843
283328.0 283328.0 1075.6  0.0   2266752.0 1725419.9 1700864.0   52369.0   41476.0 40494.4 4672.0 4507.7   6002  157.050  274    37.794  194.843
283328.0 283328.0  0.0   650.2  2266752.0 993727.1 1700864.0   52471.9   41476.0 40494.4 4672.0 4507.7   6003  157.053  274    37.794  194.847
283328.0 283328.0 1402.2  0.0   2266752.0 203814.7 1700864.0   52553.9   41476.0 40494.4 4672.0 4507.7   6004  157.058  274    37.794  194.851

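Checking again during the midday peak, the old generation was growing steadily, by roughly 80 KB every 4 seconds (OU rises from 80623.6 KB to 81323.5 KB across the ten samples).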

sudo jstat -gc 20028 4000 10
 S0C    S1C    S0U    S1U      EC       EU        OC         OU       MC     MU    CCSC   CCSU   YGC     YGCT    FGC    FGCT     GCT
283328.0 283328.0 3849.7  0.0   2266752.0 510658.3 1700864.0   80623.6   41476.0 40494.4 4672.0 4507.7   6396  158.987  274    37.794  196.780
283328.0 283328.0  0.0   2788.4 2266752.0 477917.0 1700864.0   80689.9   41476.0 40494.4 4672.0 4507.7   6397  158.991  274    37.794  196.784
283328.0 283328.0 1931.9  0.0   2266752.0 474958.2 1700864.0   80791.0   41476.0 40494.4 4672.0 4507.7   6398  158.995  274    37.794  196.788
283328.0 283328.0  0.0   2145.2 2266752.0 483570.1 1700864.0   80877.9   41476.0 40494.4 4672.0 4507.7   6399  158.999  274    37.794  196.793
283328.0 283328.0 2864.5  0.0   2266752.0 638776.6 1700864.0   80953.9   41476.0 40494.4 4672.0 4507.7   6400  159.004  274    37.794  196.797
283328.0 283328.0  0.0   1856.1 2266752.0 880657.2 1700864.0   81013.6   41476.0 40494.4 4672.0 4507.7   6401  159.009  274    37.794  196.802
283328.0 283328.0 4210.8  0.0   2266752.0 894710.5 1700864.0   81099.8   41476.0 40494.4 4672.0 4507.7   6402  159.014  274    37.794  196.808
283328.0 283328.0  0.0   2900.8 2266752.0 960077.2 1700864.0   81210.0   41476.0 40494.4 4672.0 4507.7   6403  159.018  274    37.794  196.812
283328.0 283328.0 1813.5  0.0   2266752.0 1064101.3 1700864.0   81266.9   41476.0 40494.4 4672.0 4507.7   6404  159.023  274    37.794  196.817
283328.0 283328.0  0.0   1498.7 2266752.0 1126030.4 1700864.0   81323.5   41476.0 40494.4 4672.0 4507.7   6405  159.028  274    37.794  196.822
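
The growth rate can be read straight off the OU column. A minimal sketch in Python, assuming the OU values are copied by hand from the output above; the helper name `old_gen_growth` is hypothetical and only serves this illustration:

```python
def old_gen_growth(ou_kb):
    """Return (total growth in KB, average growth in KB per sampling interval)."""
    if len(ou_kb) < 2:
        raise ValueError("need at least two samples")
    total = ou_kb[-1] - ou_kb[0]
    per_interval = total / (len(ou_kb) - 1)
    return total, per_interval

# OU column (KB) from the midday jstat run above, one sample every 4000 ms.
peak_ou = [80623.6, 80689.9, 80791.0, 80877.9, 80953.9,
           81013.6, 81099.8, 81210.0, 81266.9, 81323.5]

total, per_interval = old_gen_growth(peak_ou)
print(f"old gen grew {total:.1f} KB over {(len(peak_ou) - 1) * 4} s, "
      f"about {per_interval:.1f} KB per 4 s sample")
```

Ten samples give nine intervals, so the average is taken over 36 seconds rather than 40.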

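I then went through last night's GC log. In the log, age 1 and age 2 are the ages of objects in the young generation, that is, how many young collections they have survived. In one place there is only an age 1 entry and no age 2, and a CMS collection starts right after it, which looked like a clue. Some background on the application: most of its objects are short-lived and should rarely reach the old generation. I began to suspect that the tenuring threshold was too small, since it was configured to only 2. Objects that should have been collected in the young generation were not collected in time and were promoted straight from age 1 into the old generation.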

2017-07-12T20:54:06.586+0800: 80.309: [GC (Allocation Failure) 2017-07-12T20:54:06.586+0800: 80.309: [ParNew
Desired survivor size 145063936 bytes, new threshold 2 (max 2)
- age   1:   33586544 bytes,   33586544 total
- age   2:   35816528 bytes,   69403072 total
: 2355435K->78957K(2550080K), 0.0317835 secs] 3532169K->1284299K(4250944K), 0.0319476 secs] [Times: user=0.12 sys=0.00, real=0.03 secs]
2017-07-12T20:54:06.618+0800: 80.341: Total time for which application threads were stopped: 0.0325909 seconds, Stopping threads took: 0.0001534 seconds
2017-07-12T20:54:07.620+0800: 81.342: Total time for which application threads were stopped: 0.0006056 seconds, Stopping threads took: 0.0001820 seconds
2017-07-12T20:54:08.384+0800: 82.107: [GC (Allocation Failure) 2017-07-12T20:54:08.384+0800: 82.107: [ParNew
Desired survivor size 145063936 bytes, new threshold 2 (max 2)
- age   1:   56321632 bytes,   56321632 total
- age   2:   31853384 bytes,   88175016 total
: 2345709K->116591K(2550080K), 0.0398190 secs] 3551051K->1356910K(4250944K), 0.0399864 secs] [Times: user=0.14 sys=0.00, real=0.04 secs]
2017-07-12T20:54:08.424+0800: 82.147: Total time for which application threads were stopped: 0.0406831 seconds, Stopping threads took: 0.0001112 seconds
2017-07-12T20:54:10.085+0800: 83.808: [GC (Allocation Failure) 2017-07-12T20:54:10.085+0800: 83.808: [ParNew
Desired survivor size 145063936 bytes, new threshold 1 (max 2)
- age   1:  190948400 bytes,  190948400 total
- age   2:   54471392 bytes,  245419792 total
: 2383343K->264021K(2550080K), 0.0715712 secs] 3623662K->1535443K(4250944K), 0.0717427 secs] [Times: user=0.27 sys=0.00, real=0.07 secs]
2017-07-12T20:54:10.157+0800: 83.880: Total time for which application threads were stopped: 0.0724583 seconds, Stopping threads took: 0.0001708 seconds
2017-07-12T20:54:11.595+0800: 85.318: [GC (Allocation Failure) 2017-07-12T20:54:11.595+0800: 85.318: [ParNew
Desired survivor size 145063936 bytes, new threshold 1 (max 2)
- age   1:  188703680 bytes,  188703680 total
: 2530773K->250075K(2550080K), 0.1668482 secs] 3802195K->1753036K(4250944K), 0.1670176 secs] [Times: user=0.52 sys=0.04, real=0.17 secs]
2017-07-12T20:54:11.762+0800: 85.485: Total time for which application threads were stopped: 0.1676754 seconds, Stopping threads took: 0.0000755 seconds
2017-07-12T20:54:11.765+0800: 85.488: [GC (CMS Initial Mark) [1 CMS-initial-mark: 1502960K(1700864K)] 1771515K(4250944K), 0.0158729 secs] [Times: user=0.06 sys=0.00, real=0.02 secs]
2017-07-12T20:54:11.781+0800: 85.504: Total time for which application threads were stopped: 0.0167987 seconds, Stopping threads took: 0.0002961 seconds
2017-07-12T20:54:11.781+0800: 85.504: [CMS-concurrent-mark-start]
2017-07-12T20:54:12.050+0800: 85.773: [CMS-concurrent-mark: 0.269/0.269 secs] [Times: user=0.70 sys=0.11, real=0.26 secs]
2017-07-12T20:54:12.050+0800: 85.773: [CMS-concurrent-preclean-start]
2017-07-12T20:54:12.055+0800: 85.778: [CMS-concurrent-preclean: 0.005/0.005 secs] [Times: user=0.02 sys=0.00, real=0.01 secs]
2017-07-12T20:54:12.055+0800: 85.778: [CMS-concurrent-abortable-preclean-start]
2017-07-12T20:54:12.055+0800: 85.778: [CMS-concurrent-abortable-preclean: 0.000/0.000 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
2017-07-12T20:54:12.056+0800: 85.778: [GC (CMS Final Remark) [YG occupancy: 631012 K (2550080 K)]2017-07-12T20:54:12.056+0800: 85.778: [Rescan (parallel) , 0.0752694 secs]2017-07-12T20:54:12.131+0800: 85.854: [weak refs processing, 0.0000552 secs]2017-07-12T20:54:12.131+0800: 85.854: [class unloading, 0.0092330 secs]2017-07-12T20:54:12.140+0800: 85.863: [scrub symbol table, 0.0050382 secs]2017-07-12T20:54:12.145+0800: 85.868: [scrub string table, 0.0010191 secs][1 CMS-remark: 1502960K(1700864K)] 2133973K(4250944K), 0.0924674 secs] [Times: user=0.31 sys=0.00, real=0.09 secs]
2017-07-12T20:54:12.148+0800: 85.871: Total time for which application threads were stopped: 0.0930845 seconds, Stopping threads took: 0.0001297 seconds
2017-07-12T20:54:12.148+0800: 85.871: [CMS-concurrent-sweep-start]
2017-07-12T20:54:12.612+0800: 86.335: [CMS-concurrent-sweep: 0.464/0.464 secs] [Times: user=1.21 sys=0.26, real=0.47 secs]
2017-07-12T20:54:12.612+0800: 86.335: [CMS-concurrent-reset-start]
2017-07-12T20:54:12.617+0800: 86.339: [CMS-concurrent-reset: 0.005/0.005 secs] [Times: user=0.02 sys=0.00, real=0.00 secs]
2017-07-12T20:54:13.595+0800: 87.318: [GC (Allocation Failure) 2017-07-12T20:54:13.595+0800: 87.318: [ParNew
Desired survivor size 145063936 bytes, new threshold 1 (max 2)
- age   1:  158235824 bytes,  158235824 total
: 2516827K->235282K(2550080K), 0.1279312 secs] 3152007K->1049527K(4250944K), 0.1281230 secs] [Times: user=0.43 sys=0.03, real=0.13 secs]
2017-07-12T20:54:13.723+0800: 87.446: Total time for which application threads were stopped: 0.1288954 seconds, Stopping threads took: 0.0001631 seconds
2017-07-12T20:54:15.493+0800: 89.216: [GC (Allocation Failure) 2017-07-12T20:54:15.493+0800: 89.216: [ParNew
Desired survivor size 145063936 bytes, new threshold 2 (max 2)
- age   1:   82668368 bytes,   82668368 total
: 2502034K->203689K(2550080K), 0.1076523 secs] 3316279K->1170879K(4250944K), 0.1078419 secs] [Times: user=0.35 sys=0.02, real=0.10 secs]
2017-07-12T20:54:15.601+0800: 89.324: Total time for which application threads were stopped: 0.1085887 seconds, Stopping threads took: 0.0001270 seconds
2017-07-12T20:54:17.307+0800: 91.029: [GC (Allocation Failure) 2017-07-12T20:54:17.307+0800: 91.029: [ParNew
Desired survivor size 145063936 bytes, new threshold 2 (max 2)
- age   1:   77960664 bytes,   77960664 total
- age   2:   65094536 bytes,  143055200 total
: 2470441K->182761K(2550080K), 0.0338754 secs] 3437631K->1149951K(4250944K), 0.0340508 secs] [Times: user=0.13 sys=0.00, real=0.03 secs]
2017-07-12T20:54:17.341+0800: 91.064: Total time for which application threads were stopped: 0.0348178 seconds, Stopping threads took: 0.0001475 seconds
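
How much each young collection pushes into the old generation can be estimated from the ParNew result lines: the first occupancy triple is the young generation, the second is the whole heap, and the difference is the old generation. A rough sketch in Python, assuming the log format shown above; the function name `promoted_kb` is hypothetical:

```python
import re

# One "before->after(capacity)" occupancy triple, all values in KB.
OCCUPANCY = re.compile(r"(\d+)K->(\d+)K\((\d+)K\)")

def promoted_kb(parnew_line):
    """Estimate old-generation growth (KB) during one ParNew collection.

    The first triple on the line is the young generation, the second is the
    whole heap, so old = heap - young, both before and after the collection.
    """
    triples = OCCUPANCY.findall(parnew_line)
    if len(triples) != 2:
        raise ValueError("expected a young-generation and a heap triple")
    young, heap = [tuple(int(v) for v in t) for t in triples]
    old_before = heap[0] - young[0]
    old_after = heap[1] - young[1]
    return old_after - old_before

# Result lines copied from the log above (trailing timing fields trimmed).
with_threshold_2 = ": 2355435K->78957K(2550080K), 0.0317835 secs] 3532169K->1284299K(4250944K)"
with_threshold_1 = ": 2530773K->250075K(2550080K), 0.1668482 secs] 3802195K->1753036K(4250944K)"

print(promoted_kb(with_threshold_2))  # 28608
print(promoted_kb(with_threshold_1))  # 231539
```

The collection at 80.309, with the threshold still at 2, grows the old generation by about 28 MB. The one at 85.318, where the threshold has dropped to 1, grows it by about 226 MB and leaves it at 1502961K of 1700864K, which agrees with the occupancy reported by the CMS initial mark that follows.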

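Having found the cause, I started tuning. First, to get as many objects as possible collected in the young generation, the Survivor spaces need to be used more fully. The TargetSurvivorRatio parameter defaults to 50: once the objects surviving a young GC would fill more than 50% of a Survivor space, the JVM lowers the tenuring threshold and objects are promoted to the old generation early. I raised it to 90. I also set the tenuring threshold, MaxTenuringThreshold, to its maximum of 15. The machine has 8 GB of memory and the ops default for the maximum heap is 4.4 GB, so there was headroom, and I raised the maximum heap to 6 GB.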

-Xmx6000m
-Xms6000m
-Xmn4000m
-XX:TargetSurvivorRatio=90
-XX:MaxTenuringThreshold=15
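
The effect of these two flags can be illustrated with a simplified model of how the tenuring threshold is chosen after each young GC. This is a sketch under stated assumptions, not the JVM's actual code: it works in bytes, the function names are hypothetical, and rounding inside the JVM may differ slightly. The survivor capacities are the S0C values from the jstat output before (283328 KB) and after (350976 KB) the change.

```python
def desired_survivor_bytes(survivor_capacity_bytes, target_survivor_ratio):
    """Portion of one Survivor space the JVM aims to fill after a young GC."""
    return survivor_capacity_bytes * target_survivor_ratio // 100

def tenuring_threshold(age_bytes, desired_bytes, max_tenuring_threshold):
    """Pick the threshold from an age table; age_bytes[i] holds age i + 1.

    Ages are summed from youngest to oldest; the first age at which the
    running total exceeds the desired size becomes the threshold, capped
    by MaxTenuringThreshold.
    """
    total = 0
    age = 1
    for size in age_bytes:
        total += size
        if total > desired_bytes:
            break
        age += 1
    else:
        # Total never exceeded the desired size: only the cap applies.
        age = max_tenuring_threshold
    return min(age, max_tenuring_threshold)

# Old settings: default TargetSurvivorRatio=50, MaxTenuringThreshold=2.
old_desired = desired_survivor_bytes(283328 * 1024, 50)
print(old_desired)  # 145063936, the "Desired survivor size" in the log

# Age table from the collection at 83.808 in the log.
ages = [190948400, 54471392]
print(tenuring_threshold(ages, old_desired, 2))   # 1

# New settings: TargetSurvivorRatio=90, MaxTenuringThreshold=15.
new_desired = desired_survivor_bytes(350976 * 1024, 90)
print(tenuring_threshold(ages, new_desired, 15))  # 15
```

Under the old settings, age 1 alone exceeds the desired size, so the threshold drops to 1, matching the log. With the larger Survivor space and the 90% target, the same age table fits and the threshold stays at the maximum. The real age distribution will of course differ once objects are allowed to age longer, so this only shows the direction of the change.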

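After the change, I checked the GC statistics again the next morning: there had not been a single FGC overnight, and the old generation was growing very slowly. FGC cannot be avoided entirely, but it now happens less often and later.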

sudo jstat -gc 31703 5000 10
 S0C    S1C    S0U    S1U      EC       EU        OC         OU       MC     MU    CCSC   CCSU   YGC     YGCT    FGC    FGCT     GCT
350976.0 350976.0  0.0   1974.1 2808320.0 686899.3 2048000.0   85810.0   41856.0 41050.9 4736.0 4544.5   4017   40.907   0      0.000   40.907
350976.0 350976.0 1834.3  0.0   2808320.0 623932.5 2048000.0   85816.7   41856.0 41050.9 4736.0 4544.5   4018   40.916   0      0.000   40.916
350976.0 350976.0  0.0   2378.6 2808320.0 426285.4 2048000.0   85832.0   41856.0 41050.9 4736.0 4544.5   4019   40.926   0      0.000   40.926
350976.0 350976.0 2783.9  0.0   2808320.0 556281.0 2048000.0   85858.5   41856.0 41050.9 4736.0 4544.5   4020   40.935   0      0.000   40.935
350976.0 350976.0  0.0   2848.8 2808320.0 609204.5 2048000.0   85893.7   41856.0 41050.9 4736.0 4544.5   4021   40.944   0      0.000   40.944
350976.0 350976.0 2076.1  0.0   2808320.0 735427.5 2048000.0   85989.6   41856.0 41050.9 4736.0 4544.5   4022   40.953   0      0.000   40.953
350976.0 350976.0  0.0   2687.3 2808320.0 685107.7 2048000.0   86028.1   41856.0 41050.9 4736.0 4544.5   4023   40.963   0      0.000   40.963
350976.0 350976.0 3439.6  0.0   2808320.0 507868.2 2048000.0   86074.3   41856.0 41050.9 4736.0 4544.5   4024   40.972   0      0.000   40.972
350976.0 350976.0  0.0   2130.9 2808320.0 950108.9 2048000.0   86145.5   41856.0 41050.9 4736.0 4544.5   4025   40.982   0      0.000   40.982
350976.0 350976.0 3430.0  0.0   2808320.0 917396.6 2048000.0   86145.5   41856.0 41050.9 4736.0 4544.5   4026   40.992   0      0.000   40.992
