
Spark serialization overflow

Serialization buffer overflow

Caused by: org.apache.spark.SparkException: Kryo serialization failed: Buffer overflow. Available: 0, required: 21. To avoid this, increase spark.kryoserializer.buffer.max value.

    at org.apache.spark.serializer.KryoSerializerInstance.serialize(KryoSerializer.scala:299)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:240)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)

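 import org.apache.spark.SparkConf

 // Raise the maximum Kryo buffer size; the unit is written out explicitly (128 MiB).
 val sparkConf = new SparkConf().setAppName(Constants.SPARK_NAME_APP)
     .set("spark.kryoserializer.buffer.max", "128m")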

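Cause analysis: RDD extends scala.AnyRef with scala.Serializable, so when a job calls textFile, reads table data, or otherwise creates many new RDDs, DataFrames, or Datasets, remember to raise this value. spark.kryoserializer.buffer.max has to be larger than any single object Kryo serializes, and it must stay below 2048m.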
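For reference, here is a minimal sketch of a fuller Kryo setup. The stack trace shows Kryo is already enabled in the failing job, so the serializer line only makes that explicit. The app name, the local master, and the 64k / 256m sizes are illustrative assumptions and do not come from the original job. Pick a maximum that fits your largest serialized object.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object KryoBufferExample {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("kryo-buffer-example") // hypothetical app name
      .setMaster("local[*]")             // assumption: local run, for illustration only
      // Select Kryo explicitly as the data serializer.
      .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
      // Initial buffer size; Kryo grows it on demand up to buffer.max.
      .set("spark.kryoserializer.buffer", "64k")
      // Upper bound for the buffer; must stay below 2048m.
      .set("spark.kryoserializer.buffer.max", "256m")

    val sc = new SparkContext(conf)
    // Print the effective value to confirm the setting was picked up.
    println(sc.getConf.get("spark.kryoserializer.buffer.max"))
    sc.stop()
  }
}
```

The same property can also be passed at submit time with --conf spark.kryoserializer.buffer.max=256m, which avoids hard-coding the size in the application.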