Flink clickhouse-jdbc and flink-connector writing to ClickHouse: how a jar conflict caused the "Deployment took more than 60 seconds. Please check if the requested resources are available in the YARN cluster" and "Could not resolve ResourceManager address akka" errors

阿新 • Published: 2021-08-14

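I. Symptoms: the job runs in Flink on YARN mode and writes data to ClickHouse. Even though the YARN cluster has plenty of resources, submission keeps reporting: Deployment took more than 60 seconds. Please check if the requested resources are available in the YARN cluster. On the surface this suggests the YARN cluster is short of resources, but in fact the resources are sufficient.

The Flink JobManager log shows the following error over and over: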

Could not resolve ResourceManager address akka.tcp://[email protected]:38121/user/rpc/resourcemanager_*, retrying in 10000 ms: Could not connect to rpc endpoint under address akka.tcp://xxxxxxx.cn:38121/user/rpc/resourcemanager_*.

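This log makes it fairly clear that the problem is not YARN cluster resources but communication within the YARN cluster.

1) Cross-check: other Flink streaming jobs do not hit this problem; it appears only when writing to ClickHouse. That narrows the likely cause down to this job's code or the jars it ships.

2) Check the pom dependencies: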

        <dependency>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-connector-jdbc_2.11</artifactId>
            <version>${flink.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-streaming-java_2.11</artifactId>
            <version>${flink.version}</version>
            <scope>provided</scope>
        </dependency>
        <dependency>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-connector-kafka_2.11</artifactId>
            <version>${flink.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-java</artifactId>
            <version>${flink.version}</version>
            <scope>provided</scope>
        </dependency>
        <dependency>
            <groupId>ru.yandex.clickhouse</groupId>
            <artifactId>clickhouse-jdbc</artifactId>
            <version>${clickhouse-jdbc.version}</version>
        </dependency>
        <dependency>
            <groupId>mysql</groupId>
            <artifactId>mysql-connector-java</artifactId>
            <version>${mysql-connector-java.version}</version>
        </dependency>

The log shows no obvious sign of a jar conflict, but the "Could not resolve ResourceManager address ... Could not connect to rpc endpoint" error above still hints that a jar conflict, or a jar version mismatch, might be what makes the connection fail.

In a Hadoop environment, the package below is the one most prone to conflicts, so check it first.

                <groupId>com.google.guava</groupId>
                <artifactId>guava</artifactId>
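
One possible way to run this check at runtime is to ask the class loader for every copy of a well-known Guava class file that it can see. The following is only a minimal sketch: the class `ClasspathCopies` and its method names are made up for illustration and are not part of Flink or clickhouse-jdbc, while `com.google.common.base.Preconditions` is a real Guava class and `ClassLoader.getResources` is standard JDK API.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper: lists every classpath location that provides a given class file. */
final class ClasspathCopies {

    /** Converts a class name such as "a.b.C" into the resource path "a/b/C.class". */
    static String toResourceName(String className) {
        return className.replace('.', '/') + ".class";
    }

    /** Returns every location from which the given class loader can serve the class file. */
    static List<String> locationsOf(String className, ClassLoader loader) {
        List<String> result = new ArrayList<>();
        try {
            for (URL url : Collections.list(loader.getResources(toResourceName(className)))) {
                result.add(url.toString());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        // Probe class: a core Guava class by default, or any class name passed as an argument.
        String probe = args.length > 0 ? args[0] : "com.google.common.base.Preconditions";
        List<String> copies = locationsOf(probe, ClasspathCopies.class.getClassLoader());
        System.out.println(probe + " found in " + copies.size() + " location(s):");
        for (String copy : copies) {
            System.out.println("  " + copy);
        }
    }
}
```

If more than one location is printed, several jars on the classpath carry the same class, which makes a version conflict plausible and worth investigating.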

Sure enough, clickhouse-jdbc does pull in this package.

Exclude the package in the pom, as follows:

        <dependency>
            <groupId>ru.yandex.clickhouse</groupId>
            <artifactId>clickhouse-jdbc</artifactId>
            <version>${clickhouse-jdbc.version}</version>
            <exclusions>
                <exclusion>
                    <groupId>com.google.guava</groupId>
                    <artifactId>guava</artifactId>
                </exclusion>
            </exclusions>
        </dependency>

After re-running the job, the problem was resolved.

II. Lessons learned:

1. Nothing in any of the logs points to a jar conflict. The only visible symptom is the "Could not resolve ResourceManager address ... Could not connect to rpc endpoint" error, which is hard to connect to a jar conflict. The idea came from this blog post: https://blog.csdn.net/qq_31957747/article/details/108883793. The conflicting jar there is a different one, but the problem is very similar, so I tried this direction and found that a jar conflict really can cause this error.

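2. In Flink on YARN mode, conflicts between the jars of the Flink job and the jars of the Hadoop cluster are the most common kind. They usually cannot be spotted in the pom while writing the code, because many of the packages are not direct dependencies. But when the job is submitted with flink run -m yarn-cluster, the classpath under the Hadoop lib directory is used as well. Such conflicts are therefore hard to detect in the code yet easy to run into in practice.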
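
Because the effective classpath is only known at run time, it can help to log which jar a suspicious class was actually loaded from. The sketch below is one way to do that under stated assumptions: `WhichJar` and its methods are hypothetical names introduced here, and the approach relies only on the standard `Class.getProtectionDomain()` and `CodeSource.getLocation()` JDK calls.

```java
import java.security.CodeSource;
import java.security.ProtectionDomain;

/** Hypothetical helper: reports the jar or directory a loaded class came from. */
final class WhichJar {

    /** Returns the code source location of the class, or a placeholder if none is recorded. */
    static String loadedFrom(Class<?> type) {
        ProtectionDomain domain = type.getProtectionDomain();
        CodeSource source = (domain == null) ? null : domain.getCodeSource();
        if (source == null || source.getLocation() == null) {
            // Typical for classes loaded by the JDK bootstrap class loader.
            return "(no code source recorded)";
        }
        return source.getLocation().toString();
    }

    /** Loads the class by name and describes its origin without throwing if it is missing. */
    static String describe(String className) {
        try {
            return className + " <- " + loadedFrom(Class.forName(className));
        } catch (ClassNotFoundException | LinkageError e) {
            return className + " <- not loadable: " + e;
        }
    }

    public static void main(String[] args) {
        // A core Guava class, used here as the probe for the conflict discussed above.
        System.out.println(describe("com.google.common.base.Preconditions"));
        // The helper itself, to show what a location from the job's own code looks like.
        System.out.println(describe(WhichJar.class.getName()));
    }
}
```

Calling `describe` from the job's own `main` method before the job is submitted for execution would write the result to the client log; whether the reported location is the job jar or a jar from the Hadoop classpath indicates which copy took precedence.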

3. Do not be misled by surface symptoms. Work from the symptom down to the root cause; only then can the problem really be solved.

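This is the author's original article. Reposting is welcome provided the source is credited; copyright remains with the author. The author will pursue legal remedies against reposts of original articles that do not credit the source. Please respect the author's work.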
