Fixing the BigDecimal precision problem when saving Spark Hive table results to a MySQL table
阿新 • Published: 2021-02-10
Tags: Spark
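Problem description:
Converting the rows of the DataFrame returned by a Hive query into a case class fails during the decimal precision conversion: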
Cannot up cast xxx from decimal(29,2) to decimal(38,18) as it may truncate
Exception in thread "main" org.apache.spark.sql.AnalysisException: Cannot up cast `zskpje` from decimal(29,2) to decimal(38,18) as it may truncate
The type path of the target object is:
- field ( class: "scala.math.BigDecimal", name: "zskpje")
- root class: "com.xxx.bean.Inovice_Monthly"
You can either add an explicit cast to the input data or choose a higher precision type of the field in the target object;
at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveUpCast$.org$apache$spark$sql$catalyst$analysis$Analyzer$ResolveUpCast$$fail(Analyzer.scala:2292)
at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveUpCast$$anonfun$apply$37$$anonfun$applyOrElse$15.applyOrElse(Analyzer.scala:2308)
at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveUpCast$$anonfun$apply$37$$anonfun$applyOrElse$15.applyOrElse(Analyzer.scala:2303)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$2.apply(TreeNode.scala:267)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$2.apply(TreeNode.scala:267)
at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:70)
Cause of the error:
val result = DwdDataDao.getmonthlyStatisticsData(sparkSession: SparkSession).as[Inovice_Monthly]
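The Hive query returns a DataFrame, and the goal is to convert each Row into a case class instance so that the result becomes a Dataset.
Schema of the resulting DataFrame: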
root
|-- NSR_SBH: string (nullable = true)
|-- INVOICE_TYPE: decimal(10,0) (nullable = true)
|-- TAX_RATE: decimal(18,2) (nullable = true)
|-- zskpje: decimal(29,2) (nullable = true)
......
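The case class we created (Inovice_Monthly) declares these fields as BigDecimal, which Spark maps to decimal(38,18) by default.
decimal(38,18) leaves only 38 - 18 = 20 digits for the integer part, while decimal(29,2) can hold up to 27, so DecimalType(29,2) -> DecimalType(38,18) is not a safe up cast and Spark rejects it. DecimalType(10,0) and DecimalType(18,2) need only 10 and 16 integer digits, which is why the error names only zskpje.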
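To see which columns actually block the conversion, the widening rule can be sketched in plain Scala without Spark. This is a minimal sketch that assumes the rule is "the target must have at least as many integer digits and at least as many fractional digits as the source"; `Dec`, `DecimalWidening`, and `canUpCast` are hypothetical names introduced here for illustration and are not part of Spark's API.

```scala
// Hypothetical stand-in for Spark's DecimalType, for illustration only.
final case class Dec(precision: Int, scale: Int) {
  // Digits available to the left of the decimal point.
  def integerDigits: Int = precision - scale
}

object DecimalWidening {
  // Assumed rule: an up cast is lossless only if the target has at least as
  // many integer digits and at least as many fractional digits as the source.
  def canUpCast(from: Dec, to: Dec): Boolean =
    to.integerDigits >= from.integerDigits && to.scale >= from.scale

  def main(args: Array[String]): Unit = {
    val target = Dec(38, 18) // the type inferred for scala.math.BigDecimal fields
    val columns = Seq(
      "INVOICE_TYPE" -> Dec(10, 0),
      "TAX_RATE"     -> Dec(18, 2),
      "zskpje"       -> Dec(29, 2)
    )
    for ((name, dec) <- columns)
      println(s"$name: ${dec.integerDigits} integer digits, fits = ${canUpCast(dec, target)}")
  }
}
```

Under that assumption only zskpje (27 integer digits against the 20 available) fails the check, which matches the single column named in the exception.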
case class Inovice_Monthly(
NSR_SBH: String,
INVOICE_TYPE: BigDecimal,
TAX_RATE: BigDecimal,
zskpje: BigDecimal,
......
)
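The Spark developers treat inferring the schema from a Scala case class as a convenience, and they chose not to let programmers specify the precision and scale of Decimal or BigDecimal fields in a case class. See https://issues.apache.org/jira/browse/SPARK-18484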
Solution:
I changed the BigDecimal fields in the case class to Double, then cast the corresponding columns of the result set to the matching type.
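import org.apache.spark.sql.types.DoubleType
// result is the DataFrame returned by the query, before any .as[...] call
result.withColumn("INVOICE_TYPE", result("INVOICE_TYPE").cast(DoubleType))
.withColumn("TAX_RATE", result("TAX_RATE").cast(DoubleType))
.withColumn("zskpje", result("zskpje").cast(DoubleType))
......
.as[Inovice_Monthly]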
The Row-to-case-class conversion now runs without error.
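The exception message also suggests adding an explicit cast to the input data, which keeps BigDecimal in the case class instead of switching to Double. The following is a minimal sketch of that alternative, not the approach used above. `InvoiceRow`, `ExplicitDecimalCast`, and `toBigDecimalColumn` are hypothetical names, the one-row DataFrame stands in for the Hive query result, and a local SparkSession is assumed. Values that need more than 20 integer digits would not fit in decimal(38,18), so this only suits data known to stay within that range.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col
import org.apache.spark.sql.types.DecimalType

// Hypothetical, trimmed-down stand-in for Inovice_Monthly.
case class InvoiceRow(NSR_SBH: String, zskpje: BigDecimal)

object ExplicitDecimalCast {
  // Cast one column to decimal(38,18), the type inferred for BigDecimal fields.
  def toBigDecimalColumn(df: DataFrame, name: String): DataFrame =
    df.withColumn(name, col(name).cast(DecimalType(38, 18)))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("explicit-decimal-cast")
      .getOrCreate()
    import spark.implicits._

    // Stand-in for the Hive result: zskpje arrives as decimal(29,2).
    val df = Seq(("A001", "1234.56")).toDF("NSR_SBH", "zskpje")
      .withColumn("zskpje", col("zskpje").cast(DecimalType(29, 2)))

    // After the explicit cast the schema matches the case class,
    // so .as[...] no longer needs an up cast.
    val ds = toBigDecimalColumn(df, "zskpje").as[InvoiceRow]

    ds.printSchema()
    ds.show(false)
    spark.stop()
  }
}
```

Compared with casting to Double, this keeps exact decimal arithmetic for amount columns such as zskpje, at the cost of the narrower integer range.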