ClickHouse input and output formats: ORC

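Input and output of ORC data

ORC is supported for input only: ORC data can be inserted into ClickHouse, but query results cannot be written out as ORC.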

Mapping between ORC and ClickHouse data types

ORC data type (INSERT)    ClickHouse data type
UINT8, BOOL               UInt8
INT8                      Int8
UINT16                    UInt16
INT16                     Int16
UINT32                    UInt32
INT32                     Int32
UINT64                    UInt64
INT64                     Int64
FLOAT, HALF_FLOAT         Float32
DOUBLE                    Float64
DATE32                    Date
DATE64, TIMESTAMP         DateTime
STRING, BINARY            String
DECIMAL                   Decimal

Notes:

  • Unsupported ORC data types: TIME32, FIXED_SIZE_BINARY, JSON, UUID, ENUM.
  • The column names of the ClickHouse table must match the column names in the ORC data.
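
The mapping table and the column-name rule can be sketched together as a small shell helper. This is only an illustration, not part of ClickHouse: the function names orc_to_ch_type and gen_create_table are hypothetical, the type names follow the table above, and the TinyLog engine is simply the one used for the test table in this post.

```shell
#!/bin/sh
# Hypothetical helper: map an ORC type name to the ClickHouse type from the table above.
orc_to_ch_type() {
  case "$1" in
    UINT8|BOOL)        echo UInt8 ;;
    INT8)              echo Int8 ;;
    UINT16)            echo UInt16 ;;
    INT16)             echo Int16 ;;
    UINT32)            echo UInt32 ;;
    INT32)             echo Int32 ;;
    UINT64)            echo UInt64 ;;
    INT64)             echo Int64 ;;
    FLOAT|HALF_FLOAT)  echo Float32 ;;
    DOUBLE)            echo Float64 ;;
    DATE32)            echo Date ;;
    DATE64|TIMESTAMP)  echo DateTime ;;
    STRING|BINARY)     echo String ;;
    DECIMAL)           echo Decimal ;;  # precision and scale still have to be added by hand
    *)                 echo "unsupported ORC type: $1" >&2; return 1 ;;
  esac
}

# Hypothetical helper: build a CREATE TABLE statement from name:ORCTYPE pairs.
# The ORC column names are reused unchanged, because ClickHouse matches columns by name.
gen_create_table() {
  table="$1"; shift
  cols=""
  for pair in "$@"; do
    name="${pair%%:*}"                                     # text before the first colon
    ch_type="$(orc_to_ch_type "${pair#*:}")" || return 1   # text after the first colon
    cols="${cols:+$cols, }$name $ch_type"
  done
  echo "CREATE TABLE $table ($cols) ENGINE=TinyLog;"
}
```

For example, gen_create_table orc_demo srcip:STRING destip:STRING time:TIMESTAMP prints CREATE TABLE orc_demo (srcip String, destip String, time DateTime) ENGINE=TinyLog; and an unmapped type such as UUID makes the helper fail with an error message.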

Generating an ORC file with Spark

// Run in spark-shell, where sc (SparkContext) and spark (SparkSession) are predefined.
import spark.implicits._

// Sample rows: source IP, destination IP, timestamp string.
val list = List(
  ("113.248.234.232", "123.212.22.01", "2018-07-12 14:35:31"),
  ("115.248.158.231", "154.245.56.23", "2020-07-12 13:26:26"),
  ("115.248.158.231", "154.245.56.23", "2020-07-12 13:22:13"),
  ("187.248.135.230", "221.228.112.45", "2019-08-09 13:17:39"),
  ("187.248.234.232", "221.228.112.24", "2019-08-09 20:51:16"),
  ("115.248.158.231", "154.245.56.23", "2020-07-12 17:22:56")
)
val rdd = sc.makeRDD(list)
val df = rdd.toDF("srcip", "destip", "time")

// repartition(1) makes Spark write a single part file under /tmp/orc.
df.repartition(1).write.format("orc").mode("append").save("/tmp/orc")

Creating the test table

create table orc_demo (srcip String, destip String, time DateTime) ENGINE=TinyLog;

Importing the data

# file.orc stands for the part-*.orc file that Spark wrote under /tmp/orc
cat file.orc | clickhouse-client --query="INSERT INTO orc_demo FORMAT ORC"
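
Spark writes its output as a directory of part files rather than as a single file, so the import may need to run once per part file. A minimal sketch of such a loop follows; the function name import_orc_dir and the CH_CLIENT variable are assumptions introduced here for illustration, and only the clickhouse-client --query invocation comes from the command above.

```shell
#!/bin/sh
# CH_CLIENT is a hypothetical override so a different client command can be substituted.
CH_CLIENT="${CH_CLIENT:-clickhouse-client}"

# Hypothetical helper: insert every *.orc file found in a directory into one table.
import_orc_dir() {
  dir="$1"    # Spark output directory, e.g. /tmp/orc
  table="$2"  # target ClickHouse table, e.g. orc_demo
  for f in "$dir"/*.orc; do
    [ -e "$f" ] || continue   # the glob matched nothing
    # Feed the file on stdin, as the cat pipeline does; stop at the first failure.
    $CH_CLIENT --query="INSERT INTO $table FORMAT ORC" < "$f" || return 1
  done
}
```

It would be called as import_orc_dir /tmp/orc orc_demo. The *.orc glob skips the _SUCCESS marker and the hidden .crc checksum files that Spark places next to the data.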

Querying the result

select * from orc_demo