Exporting Hive data to a local CSV file
https://www.iteblog.com/archives/955.html
https://cloud.tencent.com/developer/article/1352376
https://blog.csdn.net/pzw_0612/article/details/48064697
https://blog.csdn.net/gezailushang/article/details/83586042
There are five methods:
1. Convert the Hive table to a Spark DataFrame, then export it to HDFS with DataFrameWriter.csv (df.write.csv). Note that write.csv requires an output path, and header=True keeps the column names (the path below is illustrative):
df = spark.sql("select * from test.student3")
df.write.csv("/tmp/student3_csv", header=True)
2. In PySpark, run the export through spark.sql(sql_str), where spark is a SparkSession with Hive support enabled (a HiveContext in older Spark versions) and sql_str is a Hive export statement such as an INSERT OVERWRITE DIRECTORY query:
spark.sql(sql_str)
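A sketch of what sql_str might look like for this approach; the table name, directory, and delimiter are assumptions for illustration, and running the statement requires Spark built with Hive support:

```python
# Build the Hive export statement that spark.sql(sql_str) would execute.
# Table, directory, and delimiter below are illustrative only.
sql_str = (
    "INSERT OVERWRITE LOCAL DIRECTORY '/tmp/student3_export' "
    "ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' "
    "SELECT * FROM test.student3"
)

# On a real cluster:
# spark = SparkSession.builder.enableHiveSupport().getOrCreate()
# spark.sql(sql_str)
print(sql_str)
```

Like method 3, this writes Hive's delimited part files rather than a single CSV with a header.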
3. Use Hive's INSERT syntax to export files. The result is Hive's raw delimited output (headerless, and possibly split into several part files), not a complete CSV:
insert overwrite local directory '/url/lxb/hive' row format delimited fields terminated by ',' select * from table_name limit 100
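Because this leaves headerless part files (000000_0, 000001_0, ...) in the target directory, a small post-processing step can merge them into one CSV with a header. A minimal sketch; the function name, directory layout, and column names are illustrative:

```python
# Merge Hive's headerless part files into one CSV with a header row.
# Paths and column names here are made up for illustration.
import glob
import os

def merge_hive_export(export_dir, out_csv, header):
    """Concatenate every part file under export_dir into out_csv,
    prepending a comma-separated header line."""
    with open(out_csv, "w") as out:
        out.write(",".join(header) + "\n")
        for part in sorted(glob.glob(os.path.join(export_dir, "*"))):
            # Skip the output file if it lives in the same directory.
            if os.path.abspath(part) == os.path.abspath(out_csv):
                continue
            with open(part) as f:
                out.write(f.read())
```

Usage would be e.g. merge_hive_export('/url/lxb/hive', 'student.csv', ['id', 'name']).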
4. Use Hive's -e option to run the query from the command line:
hive -e 'select * from test.student3' >> /usr/lxb/student.txt
To produce a usable CSV, also print the header and translate Hive's default \x01 (Ctrl-A) field delimiter into commas; note the sed pattern must be '\x01', not 'x01':
hive -e 'set hive.execution.engine=tez; set hive.cli.print.header=true; set hive.resultset.use.unique.column.names=false; select * from database.table' | sed 's/\x01/,/g' > /usr/lxb/hive/test.csv
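The delimiter replacement can be checked without a Hive cluster by feeding sed a synthetic line (the \x01 escape in the sed pattern is a GNU sed feature):

```shell
# Synthetic stand-in for `hive -e` output: fields joined by the
# non-printable \x01 (Ctrl-A) byte, written here via its octal escape \001.
printf '1\001Alice\00123\n' | sed 's/\x01/,/g'
# prints: 1,Alice,23
```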
5. Convert the Hive table to a Spark DataFrame, turn it into a pandas DataFrame with DataFrame.toPandas(), then export locally with pandas' DataFrame.to_csv.
The to_csv parameters are documented in the links below:
https://blog.csdn.net/u010801439/article/details/80033341
https://blog.csdn.net/qton_csdn/article/details/70493196
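A sketch of the final to_csv step. The Spark half (pdf = df.toPandas()) needs a cluster, so a hand-built pandas DataFrame with made-up columns stands in for it here:

```python
# Illustrates the pandas export of method 5; the DataFrame below is a
# stand-in for the result of spark_df.toPandas().
import pandas as pd

pdf = pd.DataFrame({"id": [1, 2], "name": ["Alice", "Bob"]})

# index=False drops pandas' row index column; sep and encoding are the
# options most often tuned when matching Hive's output.
pdf.to_csv("student.csv", index=False, sep=",", encoding="utf-8")

print(open("student.csv").read())
```

Unlike methods 1-4, this pulls the whole table into driver memory, so it only suits tables small enough to fit on one machine.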