
YARN fails and exits right after startup: "Invalid resource request, no resources requested"

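After I changed the YARN configuration in ambari-server and restarted the services, the RM (ResourceManager) failed to start. The error was an odd one: "Invalid resource request, no resources requested". The details are as follows: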

2018-08-21 16:06:16,639 FATAL resourcemanager.ResourceManager (ResourceManager.java:main(1495)) - Error starting ResourceManager
org.apache.hadoop.service.ServiceStateException: org.apache.hadoop.yarn.exceptions.InvalidResourceRequestException: Invalid resource request, no resources requested
        at org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:105)
        at org.apache.hadoop.service.AbstractService.start(AbstractService.java:203)
        at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startActiveServices(ResourceManager.java:1213)
        at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1254)
        at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1250)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1688)
        at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.transitionToActive(ResourceManager.java:1250)
        at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStart(ResourceManager.java:1301)
        at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
        at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1492)
Caused by: org.apache.hadoop.yarn.exceptions.InvalidResourceRequestException: Invalid resource request, no resources requested
        at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.validateAndCreateResourceRequest(RMAppManager.java:489)
        at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.createAndPopulateNewRMApp(RMAppManager.java:389)
        at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recoverApplication(RMAppManager.java:357)
        at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recover(RMAppManager.java:568)
        at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.recover(ResourceManager.java:1464)
        at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStart(ResourceManager.java:825)
        at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
        ... 10 more
2018-08-21 16:06:16,656 INFO zookeeper.ZooKeeper (ZooKeeper.java:close(684)) - Session: 0x36546c044dc0113 closed
2018-08-21 16:06:16,656 INFO zookeeper.ClientCnxn (ClientCnxn.java:run(524)) - EventThread shut down
2018-08-21 16:06:16,741 INFO resourcemanager.ResourceManager (LogAdapter.java:info(49)) - SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down ResourceManager at ep-bd01/192.168.58.11

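Extensive searching online turned up nothing. As a last resort I rebooted the hosts and restarted all services, and the RM came up. When I restarted the RM again, it failed with the same error.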

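1. I configured RM HA. This time the RMs started, but both configured RM nodes stayed in standby state. I then changed the configuration files countless more times, to no effect: the error message stayed the same.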

2. I manually activated the RM on one host. This failed as well, with the same error:

[[email protected] zookeeper]# yarn rmadmin -transitionToActive --forceactive --forcemanual rm1
You have specified the --forcemanual flag. This flag is dangerous, as it can induce a split-brain scenario that WILL CORRUPT your HDFS namespace, possibly irrecoverably.

It is recommended not to use this flag, but instead to shut down the cluster and disable automatic failover if you prefer to manually manage your HA state.

You may abort safely by answering 'n' or hitting ^C now.

Are you sure you want to continue? (Y or N) y
......
......
18/08/29 14:31:10 WARN ha.ActiveStandbyElector: Exception handling the winning of election
org.apache.hadoop.ha.ServiceFailedException: RM could not transition to Active
        at org.apache.hadoop.yarn.server.resourcemanager.ActiveStandbyElectorBasedElectorService.becomeActive(ActiveStandbyElectorBasedElectorService.java:146)
        at org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:896)
        at org.apache.hadoop.ha.ActiveStandbyElector.processResult(ActiveStandbyElector.java:476)
        at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:611)
        at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:510)
Caused by: org.apache.hadoop.ha.ServiceFailedException: Error when transitioning to Active mode
        at org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:325)
        at org.apache.hadoop.yarn.server.resourcemanager.ActiveStandbyElectorBasedElectorService.becomeActive(ActiveStandbyElectorBasedElectorService.java:144)
        ... 4 more
Caused by: org.apache.hadoop.service.ServiceStateException: org.apache.hadoop.yarn.exceptions.InvalidResourceRequestException: Invalid resource request, no resources requested
        at org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:105)
        at org.apache.hadoop.service.AbstractService.start(AbstractService.java:203)
        at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startActiveServices(ResourceManager.java:1213)
        at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1254)
        at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1250)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1688)
        at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.transitionToActive(ResourceManager.java:1250)
        at org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:320)
        ... 5 more

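At last! After several days of searching and thinking, I concluded that this is probably an error message new in HDP 3.0. It resembles a problem I found online whose symptom was likewise that the RM died right after starting successfully. That report suggested the failure might be triggered when the RM recovers application state, so I tried it right away.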
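The first stack trace is consistent with that guess: the innermost exception is thrown beneath RMAppManager.recover. As a rough illustration only, here is a minimal Python sketch that pulls the innermost "Caused by" entry and any recover* frames out of a Java stack trace. The helper name summarize_trace is hypothetical and not part of Hadoop; the sample text is copied from the log above.

```python
import re


# Hypothetical helper (not a Hadoop API): summarize where a Java stack trace fails.
def summarize_trace(log_text):
    """Return (root_cause, recovery_frames) for a Java stack trace."""
    # The last "Caused by:" line names the innermost exception and its message.
    causes = re.findall(r"Caused by: ([\w.$]+): ([^\n]*)", log_text)
    root_cause = causes[-1] if causes else None
    # Frames whose method name starts with "recover" point at state recovery.
    frames = re.findall(r"at ([\w.$]+\.recover\w*)\(", log_text)
    return root_cause, frames


# Sample lines copied from the ResourceManager log in this post.
sample = """\
Caused by: org.apache.hadoop.yarn.exceptions.InvalidResourceRequestException: Invalid resource request, no resources requested
        at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.validateAndCreateResourceRequest(RMAppManager.java:489)
        at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.createAndPopulateNewRMApp(RMAppManager.java:389)
        at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recoverApplication(RMAppManager.java:357)
        at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recover(RMAppManager.java:568)
        at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.recover(ResourceManager.java:1464)
"""

root_cause, frames = summarize_trace(sample)
print(root_cause)
for frame in frames:
    print(frame)
```

If recover* frames sit directly under the innermost exception, as they do here, the stored application state is a reasonable first suspect.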

In short, use the ZooKeeper command-line client to delete every child node under /rmstore/ZKRMStateRoot/RMAppRoot.

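Then restart the RM. To my surprise, the problem that had plagued me for days was solved just like that. See the output below for the details (let me enjoy the moment first).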

[[email protected] pg_log]# sudo -u zookeeper /usr/hdp/3.0.0.0-1634/zookeeper/bin/zkCli.sh

Connecting to localhost:2181
2018-08-29 15:04:02,395 - INFO [main:[email protected]100] - Client environment:zookeeper.version=3.4.6-1634--1, built on 07/12/2018 20:01 GMT
2018-08-29 15:04:02,397 - INFO [main:[email protected]100] - Client environment:host.name=ep-bd03
2018-08-29 15:04:02,397 - INFO [main:[email protected]100] - Client environment:java.version=1.8.0_181
2018-08-29 15:04:02,398 - INFO [main:[email protected]100] - Client environment:java.vendor=Oracle Corporation
2018-08-29 15:04:02,398 - INFO [main:[email protected]100] - Client environment:java.home=/usr/java/jdk1.8.0_181-amd64/jre
2018-08-29 15:04:02,399 - INFO [main:[email protected]100] - Client environment:java.class.path=/usr/hdp/3.0.0.0-1634/zookeeper/bin/../build/classes:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../build/lib/*.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/xercesMinimal-1.9.6.2.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/wagon-provider-api-2.4.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/wagon-http-shared4-2.4.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/wagon-http-shared-1.0-beta-6.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/wagon-http-lightweight-1.0-beta-6.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/wagon-http-2.4.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/wagon-file-1.0-beta-6.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/slf4j-log4j12-1.6.1.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/slf4j-api-1.6.1.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/plexus-utils-3.0.8.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/plexus-interpolation-1.11.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/plexus-container-default-1.0-alpha-9-stable-1.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/netty-3.10.5.Final.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/nekohtml-1.9.6.2.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/maven-settings-2.2.1.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/maven-repository-metadata-2.2.1.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/maven-project-2.2.1.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/maven-profile-2.2.1.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/maven-plugin-registry-2.2.1.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/maven-model-2.2.1.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/maven-error-diagnostics-2.2.1.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/maven-artifact-manager-2.2.1.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/maven-artifact-2.2.1.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/maven-ant-tasks-2.1.3.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/log4j-1.2.16.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/jsoup-1.7.1.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/jline-0.9.94.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/commons-logging-1.1.1.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/commons-io-2.2.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/commons-codec-1.6.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/classworlds-1.1-alpha-2.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/backport-util-concurrent-3.1.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/ant-launcher-1.8.0.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/ant-1.8.0.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../zookeeper-3.4.6.3.0.0.0-1634.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../src/java/lib/*.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../conf::/usr/share/zookeeper/*
2018-08-29 15:04:02,399 - INFO [main:[email protected]] - Client environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
2018-08-29 15:04:02,399 - INFO [main:[email protected]] - Client environment:java.io.tmpdir=/tmp
2018-08-29 15:04:02,399 - INFO [main:[email protected]] - Client environment:java.compiler=<NA>
2018-08-29 15:04:02,399 - INFO [main:[email protected]] - Client environment:os.name=Linux
2018-08-29 15:04:02,399 - INFO [main:[email protected]] - Client environment:os.arch=amd64
2018-08-29 15:04:02,399 - INFO [main:[email protected]] - Client environment:os.version=3.10.0-862.6.3.el7.x86_64
2018-08-29 15:04:02,399 - INFO [main:[email protected]] - Client environment:user.name=zookeeper
2018-08-29 15:04:02,399 - INFO [main:[email protected]] - Client environment:user.home=/var/lib/zookeeper
2018-08-29 15:04:02,400 - INFO [main:[email protected]] - Client environment:user.dir=/tmp/hsperfdata_zookeeper
2018-08-29 15:04:02,401 - INFO [main:[email protected]] - Initiating client connection, connectString=localhost:2181 sessionTimeout=30000 [email protected]
Welcome to ZooKeeper!
2018-08-29 15:04:02,417 - INFO [main-SendThread(localhost:2181):[email protected]] - Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
JLine support is enabled
2018-08-29 15:04:02,461 - INFO [main-SendThread(localhost:2181):[email protected]] - Socket connection established, initiating session, client: /127.0.0.1:7637, server: localhost/127.0.0.1:2181
[zk: localhost:2181(CONNECTING) 0] 2018-08-29 15:04:02,484 - INFO [main-SendThread(localhost:2181):[email protected]] - Session establishment complete on server localhost/127.0.0.1:2181, sessionid = 0x3658450e5f202da, negotiated timeout = 30000

WATCHER::

WatchedEvent state:SyncConnected type:None path:null
[zk: localhost:2181(CONNECTED) 0] ls /rmstore
[ZKRMStateRoot]
[zk: localhost:2181(CONNECTED) 1] ls /rmstore/ZKRMStateRoot
[ReservationSystemRoot, RMAppRoot, AMRMTokenSecretManagerRoot, EpochNode, RMDTSecretManagerRoot, RMVersionNode]

[zk: localhost:2181(CONNECTED) 6] ls /rmstore/ZKRMStateRoot/RMAppRoot
[application_1534904073745_0001, HIERARCHIES, application_1534904073745_0003, application_1534904073745_0002]

[zk: localhost:2181(CONNECTED) 3] rmr /rmstore/ZKRMStateRoot/RMAppRoot/application_1534904073745_0001
[zk: localhost:2181(CONNECTED) 4] rmr /rmstore/ZKRMStateRoot/RMAppRoot/HIERARCHIES
[zk: localhost:2181(CONNECTED) 5] rmr /rmstore/ZKRMStateRoot/RMAppRoot/application_1534904073745_0003
[zk: localhost:2181(CONNECTED) 5] rmr /rmstore/ZKRMStateRoot/RMAppRoot/application_1534904073745_0002

[zk: localhost:2181(CONNECTED) 7] ls /rmstore/ZKRMStateRoot/RMAppRoot
[]
[zk: localhost:2181(CONNECTED) 8] 
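
The deletions in the session above were typed by hand, one child node at a time. As a minimal sketch, assuming you paste in the bracketed list that `ls` prints, the following Python snippet generates the matching `rmr` lines to review and then enter in zkCli. The helper name rmr_commands is hypothetical; `rmr` is the command the ZooKeeper 3.4.6 session above uses. The snippet only prints commands and does not connect to ZooKeeper.

```python
# Hypothetical helper: turn the bracketed output of `ls <parent>` in zkCli
# into one `rmr` command per child node. It only builds strings.
def rmr_commands(parent, ls_output):
    # Drop the surrounding brackets, then split the comma-separated names.
    names = [n.strip() for n in ls_output.strip().strip("[]").split(",")]
    # An empty listing ("[]") yields no commands.
    return ["rmr {}/{}".format(parent.rstrip("/"), n) for n in names if n]


# Values copied from the zkCli session in this post.
parent = "/rmstore/ZKRMStateRoot/RMAppRoot"
ls_output = ("[application_1534904073745_0001, HIERARCHIES, "
             "application_1534904073745_0003, application_1534904073745_0002]")

commands = rmr_commands(parent, ls_output)
for cmd in commands:
    print(cmd)
```

Generating the commands rather than running them keeps a human in the loop: the list can be checked against the `ls` output before anything is removed from the state store.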
