
Spark 2: Can't write DataFrame to Parquet Hive table: the existing table format is `HiveFileFormat`. It doesn't match the specified format `ParquetFileFormat`


I'm trying to save a DataFrame to a Hive table.

This worked in Spark 1.6, but after migrating to 2.2.0 it no longer works.

Here's the code:

blocs
      .toDF()
      .repartition($"col1", $"col2", $"col3", $"col4")
      .write
      .format("parquet")
      .mode(saveMode)
      .partitionBy("col1", "col2", "col3", "col4")
      .saveAsTable("db.tbl")

org.apache.spark.sql.AnalysisException: The format of the existing table project_bsc_dhr.bloc_views is `HiveFileFormat`. It doesn't match the specified format `ParquetFileFormat`.;

asked Jan 9 '19 at 14:42 by youssef grati; edited Feb 20 '19 at 12:01 by Valeriy
  • Have you found a solution? I am facing the same issue. Can you please let me know what the workaround is? – BigD, Feb 8 '19 at 11:42
  • Yes, I used insertInto instead of saveAsTable and removed partitionBy. The code: blocs.toDF().repartition($"col1", $"col2", $"col3", $"col4").write.format("parquet").insertInto("db.tbl") – youssef grati, Feb 9 '19 at 12:07
  • I am using Spark 2.3.0. Does repartition work on the latest Spark? – BigD, Feb 9 '19 at 15:34

1 Answer


I just tried using .format("hive") with saveAsTable after getting the same error, and it worked.
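As a rough sketch of that change, applied to the code from the question: the only difference in the writer chain is the format string. Everything else here is an assumption for illustration. The source table db.staging_blocs is hypothetical, SaveMode.Append stands in for the question's unspecified saveMode, and the two dynamic-partition settings may or may not be needed depending on how the cluster is configured.

```scala
import org.apache.spark.sql.{DataFrame, SaveMode, SparkSession}

object HiveFormatAppend {
  // Appends df to an existing Hive-format table.
  // format("hive") matches the table's HiveFileFormat, so the format check passes.
  def appendToHiveTable(df: DataFrame, table: String, partitionCols: Seq[String]): Unit =
    df.write
      .format("hive")
      .mode(SaveMode.Append) // assumption: the question's saveMode is Append
      .partitionBy(partitionCols: _*)
      .saveAsTable(table)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("hive-format-append")
      .enableHiveSupport()
      // Assumption: writing only dynamic partitions may require these settings.
      .config("hive.exec.dynamic.partition", "true")
      .config("hive.exec.dynamic.partition.mode", "nonstrict")
      .getOrCreate()

    // Hypothetical input; in the question this is blocs.toDF().
    val blocs = spark.table("db.staging_blocs")

    appendToHiveTable(blocs, "db.tbl", Seq("col1", "col2", "col3", "col4"))
    spark.stop()
  }
}
```

This assumes db.tbl already exists as a Hive-format table partitioned by the same four columns; when appending, the columns passed to partitionBy have to agree with the table definition.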

I also would not recommend using insertInto, as suggested by the author, because it is not type-safe (as far as that term applies to the SQL API) and is error-prone: it ignores column names and uses position-based resolution.
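To make the position-based behaviour concrete, here is a small local sketch. The table people_demo, its columns, and the helper alignToTable are all made up for illustration. The first insert silently puts the city value into the name column because only positions are compared. If insertInto has to be used, one possible guard is to reorder the DataFrame to the table's column order first.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object InsertIntoByPosition {
  // Hypothetical helper: reorders df's columns to match the target table's order.
  def alignToTable(spark: SparkSession, df: DataFrame, table: String): DataFrame = {
    val tableCols = spark.table(table).columns
    df.select(tableCols.head, tableCols.tail: _*)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("insert-into-by-position")
      .getOrCreate()
    import spark.implicits._

    // Demo table with columns in the order (name, city).
    Seq(("alice", "paris")).toDF("name", "city")
      .write.mode("overwrite").saveAsTable("people_demo")

    // Same column names, but in the opposite order.
    val swapped = Seq(("berlin", "bob")).toDF("city", "name")

    // insertInto ignores names: "berlin" is written to the name column.
    swapped.write.insertInto("people_demo")

    // Reordering first keeps each value in the intended column.
    alignToTable(spark, swapped, "people_demo").write.insertInto("people_demo")

    spark.table("people_demo").show()
    spark.stop()
  }
}
```

saveAsTable in append mode does not need this step, since it resolves columns by name against the existing table.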