
Exporting Hive Data to a Local CSV File

https://www.iteblog.com/archives/955.html

https://cloud.tencent.com/developer/article/1352376

https://blog.csdn.net/pzw_0612/article/details/48064697

https://blog.csdn.net/gezailushang/article/details/83586042

There are five methods:

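1. Load the Hive table into a Spark DataFrame, then export it with DataFrame.write.csv() (that is, DataFrameWriter.csv). The output is written to HDFS as a directory of part files.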

df = spark.sql("select * from test.student3")

df.write.csv("/path/on/hdfs/student3")  # csv() requires an output path; this one is a placeholder
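A slightly fuller sketch of this approach follows. It assumes Spark 2.x or later, where spark is a SparkSession built with Hive support (or an older HiveContext). The function name export_table_to_csv and the output path are hypothetical. Since DataFrameWriter.csv writes a directory of part files rather than a single file, the sketch calls coalesce(1) to get one part file, which is only sensible for small results.

```python
def export_table_to_csv(spark, table, out_dir):
    """Write a Hive table to out_dir as CSV using Spark.

    spark   -- a SparkSession with Hive support (or a HiveContext)
    table   -- fully qualified table name, e.g. "test.student3"
    out_dir -- output directory, on HDFS by default (hypothetical path)
    """
    df = spark.sql("select * from {}".format(table))
    # coalesce(1) forces a single part file; drop it for large tables.
    (df.coalesce(1)
       .write
       .csv(out_dir, header=True, mode="overwrite"))
    return out_dir
```

With a real session this might be called as export_table_to_csv(spark, "test.student3", "/tmp/student3_csv"). The header=True option writes the column names as the first row, and mode="overwrite" replaces any existing output directory.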

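2. In PySpark, call spark.sql(sql_str), where spark is a HiveContext (in Spark 2.x and later, a SparkSession created with enableHiveSupport()) and sql_str is the SQL statement to run.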

spark.sql("")

3. Use Hive's insert syntax to export files. The result is a directory of Hive output files (headerless part files), not a complete CSV file.

insert overwrite local directory '/usr/lxb/hive'
row format delimited
fields terminated by ','
select * from table_name limit 100;
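Because the exported directory holds one or more headerless part files (typically named like 000000_0), a small post-processing step is needed to get a single CSV. The following is a minimal sketch using only the Python standard library. The function name merge_hive_parts, the column names, and the paths are assumptions for illustration, and it assumes the fields contain no embedded commas or newlines.

```python
import csv
import glob
import os


def merge_hive_parts(part_dir, out_path, header):
    """Concatenate Hive part files in part_dir into one CSV with a header row."""
    # Skip hidden or marker files such as .crc checksums; sort for a stable order.
    parts = sorted(
        p for p in glob.glob(os.path.join(part_dir, "*"))
        if os.path.isfile(p) and not os.path.basename(p).startswith((".", "_"))
    )
    rows = 0
    with open(out_path, "w", newline="", encoding="utf-8") as out:
        writer = csv.writer(out)
        writer.writerow(header)
        for part in parts:
            with open(part, newline="", encoding="utf-8") as f:
                for row in csv.reader(f, delimiter=","):
                    writer.writerow(row)
                    rows += 1
    return rows
```

The output file should live outside part_dir so that it is not picked up as a part file on a later run. The function returns the number of data rows written, which is handy for checking against the limit used in the query.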

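4. Use the hive command's -e option to run a query and redirect its output to a file. Note that >> appends to an existing file; use > to overwrite it. The Hive CLI prints columns separated by tabs, so the second command replaces each tab with a comma (the \t escape assumes GNU sed).

hive -e 'select * from test.student3' >> /usr/lxb/student.txt

hive -e 'set hive.execution.engine=tez; set hive.cli.print.header=true; set hive.resultset.use.unique.column.names=false; select * from database.table' | sed 's/\t/,/g' > /usr/lxb/hive/test.csv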
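A plain sed substitution produces a broken file whenever a field itself contains a comma or a double quote. As a hedged alternative, the sketch below takes the tab-separated text printed by hive -e and rewrites it with Python's csv module, which quotes such fields. The function name tsv_to_csv is hypothetical, and the sketch assumes fields contain no embedded tabs or newlines.

```python
import csv
import io


def tsv_to_csv(tsv_text):
    """Convert tab-separated text (as printed by `hive -e`) to CSV text."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for line in tsv_text.splitlines():
        # Split on tabs only; csv.writer quotes fields containing commas or quotes.
        writer.writerow(line.split("\t"))
    return out.getvalue()
```

To use it as a filter in place of sed, one could pass sys.stdin.read() to the function and write the returned string to standard output.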

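5. Load the Hive table into a Spark DataFrame, convert it to a pandas DataFrame with DataFrame.toPandas(), and then write it to a local file with DataFrame.to_csv. Because toPandas() collects all rows onto the driver, this suits only results that fit in memory.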
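This approach can be sketched as follows. It assumes spark is a SparkSession with Hive support (or a HiveContext) and that pandas is installed on the driver. The function name export_query_to_local_csv and the local path are hypothetical.

```python
def export_query_to_local_csv(spark, query, local_path):
    """Run a Hive query via Spark and save the result as one local CSV file."""
    # toPandas() collects every row to the driver, so the result must fit in memory.
    pdf = spark.sql(query).toPandas()
    # index=False omits the pandas row index; the header row is written by default.
    pdf.to_csv(local_path, index=False, encoding="utf-8")
    return local_path
```

A call might look like export_query_to_local_csv(spark, "select * from test.student3", "/tmp/student3.csv"). Unlike the Spark writer, this produces a single file with a header on the local filesystem of the driver machine.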

For the to_csv parameters, see the following links:

https://blog.csdn.net/u010801439/article/details/80033341

https://blog.csdn.net/qton_csdn/article/details/70493196