Spark SQL job fails when reading a parquet table
阿新 • Published: 2019-01-14
Cluster memory: 1024 GB (data size: 400 GB)
(1) Error message:
Job aborted due to stage failure: Serialized task 2231:2304 was 637417604 bytes, which exceeds max allowed: spark.rpc.message.maxSize (134217728 bytes). Consider increasing spark.rpc.message.maxSize or using broadcast variables for large values.
(2) Cause:
The serialized task the driver sends to executors is too large and exceeds Spark's default RPC message size limit (spark.rpc.message.maxSize, 128 MiB by default). Here the task was about 637 MB, roughly five times the limit.
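A quick sanity check of the numbers (a sketch; the byte count comes straight from the error message, and spark.rpc.message.maxSize is configured in MiB):

```python
# Serialized task size reported in the error, and the two caps in bytes.
task_bytes = 637_417_604           # size from the stage-failure message
default_limit = 128 * 1024 * 1024  # 134217728 bytes, the 128 MiB default
raised_limit = 1024 * 1024 * 1024  # cap after spark.rpc.message.maxSize=1024

print(task_bytes > default_limit)  # True: the task exceeds the default cap
print(task_bytes < raised_limit)   # True: 1024 MiB leaves plenty of headroom
```

So raising the limit to 1024 MiB is sufficient for this job, though a task this large usually also hints that broadcast variables would be the cleaner fix, as the error message itself suggests.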
(3) Solution:
Add the configuration spark.rpc.message.maxSize=1024 (the value is in MiB, so this raises the cap to 1 GiB):
spark2-submit \
  --class com.lhx.test \
  --master yarn \
  --deploy-mode cluster \
  --conf spark.rpc.message.maxSize=1024 \
  --driver-memory 30g \
  --executor-memory 12g \
  --num-executors 12 \
  --executor-cores 3 \
  --conf spark.yarn.driver.memoryOverhead=4096m \
  --conf spark.yarn.executor.memoryOverhead=4096m \
  ./test.jar
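If every job on the cluster needs the higher limit, the same setting can go into conf/spark-defaults.conf instead of a per-job --conf flag (an equivalent alternative, not shown in the original post):

```
spark.rpc.message.maxSize    1024
```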