What to do when the RunJar process left behind by Hive startup cannot be killed with kill -9?
1. Problem example
[Hadoop@master Logs]$ jps
3728 ResourceManager
6976 RunJar
7587 Jps
4277 Master
3095 NameNode
3863 NodeManager
3450 SecondaryNameNode
4362 Worker
3245 DataNode
[Hadoop@master Logs]$ kill -9 6976
[Hadoop@master Logs]$ jps
3728 ResourceManager
6976 RunJar
4277 Master
3095 NameNode
3863 NodeManager
7607 Jps
3450 SecondaryNameNode
4362 Worker
3245 DataNode
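Problem description: Hive was started abnormally and left behind a RunJar process that kill -9 fails to remove; the process has become a zombie process.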
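2. Problem analysis
Reference: https://blog.csdn.net/walykyy/article/details/113253060
A zombie process cannot be killed directly: it has already exited, and only its process-table entry remains until its parent reaps it. It can be cleared by killing the zombie's parent process, after which the orphaned zombie is adopted by init and reaped.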
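As a rough sketch of how zombies and their parents could be spotted in one pass, the filter below reads `ps`-style lines and keeps only those whose state starts with `Z`. The function name `list_zombies` is hypothetical, and the column order (PID, PPID, STAT, COMMAND) is an assumption that must match the `ps` invocation shown in the comment.

```shell
#!/usr/bin/env bash
# Hypothetical helper: filter lines of the form "PID PPID STAT COMMAND"
# down to zombies. A STAT value beginning with "Z" marks a zombie (defunct)
# process. Prints "PID PPID COMMAND" for each match.
list_zombies() {
    awk '$3 ~ /^Z/ { print $1, $2, $4 }'
}

# Live usage (assumes a procps-style ps; the empty "=" headers suppress
# the header line so that only data rows reach the filter):
#   ps -e -o pid= -o ppid= -o stat= -o comm= | list_zombies
```

For the case in this post, the expected match would be the RunJar process 6976 with parent 6975, reported under the command name java.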
3. Solution
Find the parent of the zombie process; its PID is given by the PPid field in /proc/<pid>/status.
Proceed as follows:
[Hadoop@master Logs]$ cd /proc/6976
[Hadoop@master 6976]$ ls
ls: cannot read symbolic link cwd: Permission denied
ls: cannot read symbolic link root: Permission denied
ls: cannot read symbolic link exe: Permission denied
attr coredump_filter gid_map mountinfo oom_score sched statm
autogroup cpuset io mounts oom_score_adj schedstat status
auxv cwd limits mountstats pagemap sessionid syscall
cgroup environ loginuid net patch_state setgroups task
clear_refs exe map_files ns personality smaps timers
cmdline fd maps numa_maps projid_map stack uid_map
comm fdinfo mem oom_adj root stat wchan
[Hadoop@master 6976]$ cat status
Name: java
State: Z (zombie)
Tgid: 6976
Ngid: 0
Pid: 6976
PPid: 6975
TracerPid: 0
Uid: 1001 1001 1001 1001
Gid: 1001 1001 1001 1001
FDSize: 0
Groups: 0 1001
Threads: 1
SigQ: 3/15023
SigPnd: 0000000000000000
ShdPnd: 0000000000004100
SigBlk: 0000000000000000
SigIgn: 0000000000000000
SigCgt: 2000000181005ccf
CapInh: 0000000000000000
CapPrm: 0000000000000000
CapEff: 0000000000000000
CapBnd: 0000001fffffffff
CapAmb: 0000000000000000
NoNewPrivs: 0
Seccomp: 0
Speculation_Store_Bypass: thread vulnerable
Cpus_allowed: 3
Cpus_allowed_list: 0-1
Mems_allowed:
*********(output omitted here)
Mems_allowed_list: 0
voluntary_ctxt_switches: 50
nonvoluntary_ctxt_switches: 14
[Hadoop@master 6976]$ kill -9 6975
[Hadoop@master 6976]$ jps
3728 ResourceManager
4277 Master
3095 NameNode
3863 NodeManager
7832 Jps
3450 SecondaryNameNode
4362 Worker
3245 DataNode
The steps above successfully removed the zombie RunJar process (PID 6976).
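The manual lookup in the transcript above (read State and PPid from the status file, then act on the parent) could be scripted roughly as follows. This is a minimal sketch, not a definitive tool: the function name `zombie_parent` is hypothetical, and it assumes the `State:` and `PPid:` field layout shown in the `cat status` output.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the parent PID recorded in a /proc/<pid>/status
# file, but only when the State field reports a zombie ("Z").
# Returns non-zero if the process is not a zombie or the file is unreadable.
zombie_parent() {
    local status_file=$1
    awk '
        $1 == "State:" { state = $2 }
        $1 == "PPid:"  { ppid = $2 }
        END {
            if (state == "Z" && ppid != "") { print ppid; exit 0 }
            exit 1
        }
    ' "$status_file"
}

# Usage (6976 is the RunJar PID from the transcript above):
#   parent=$(zombie_parent /proc/6976/status) && ps -o pid,user,args -p "$parent"
#   kill "$parent"    # plain SIGTERM first; fall back to kill -9 only if ignored
```

Inspecting the parent with `ps` before killing it is a deliberate choice here: the parent may be a shell or launcher script that other work depends on, so it is worth confirming what it is first.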