SparkSQL query fails with ClassNotFoundException
When querying Hive data through SparkSQL, the following error is thrown:
java.lang.ClassNotFoundException: Class org.apache.hive.hcatalog.data.JsonSerDe not found
The Hive data is delivered by Log Service (logservice) to OSS, and querying it with plain Hive SQL works fine. The table schema is as follows:
CREATE EXTERNAL TABLE `workflow_operation`(
`level` string COMMENT 'from deserializer',
`location` string COMMENT 'from deserializer',
`message` string COMMENT 'from deserializer',
`thread` string COMMENT 'from deserializer',
`time` string COMMENT 'from deserializer')
PARTITIONED BY ( `year` string, `month` string, `day` string, `hour` string)
ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'
STORED AS INPUTFORMAT 'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION 'oss://weidian-be-production/weidian_be_production/workflow_operation'
TBLPROPERTIES ( 'transient_lastDdlTime'='1535339427')
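For context, org.apache.hive.hcatalog.data.JsonSerDe expects each line of the files under the table's LOCATION to be a single JSON object whose top-level keys match the column names; the partition columns (year/month/day/hour) come from the directory path, not from the JSON itself. A hypothetical log line for this schema (all field values invented for illustration) would look like:

```json
{"level": "INFO", "location": "Workflow.java:120", "message": "job started", "thread": "main", "time": "2018-08-27 10:30:27"}
```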
Hive SQL queries succeed, but running the same query through spark-sql:

spark-sql -e "use weidian_be_production; set io.compression.codec.snappy.native=true; select * from workflow_operation limit 3;"

produces the error above. Note that hive_aux_jars_path=/usr/lib/hive-current/lib/hive-hcatalog-core-2.3.3.jar is already configured on the Hive cluster, yet the error persists.
Solution: copy /usr/lib/hive-current/lib/hive-hcatalog-core-2.3.3.jar into /usr/lib/spark-current/jars.
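The likely reason the aux-jars setting has no effect is that hive_aux_jars_path only extends the Hive CLI's classpath; spark-sql runs in a Spark JVM, which loads jars from its own jars directory or from jars passed at submit time. A sketch of the fix, assuming the paths and jar version from this environment:

```shell
# Make the HCatalog SerDe visible to every Spark application
# by placing it on Spark's default classpath (paths as above):
cp /usr/lib/hive-current/lib/hive-hcatalog-core-2.3.3.jar /usr/lib/spark-current/jars/

# Alternatively, attach the jar for a single invocation instead of copying:
spark-sql --jars /usr/lib/hive-current/lib/hive-hcatalog-core-2.3.3.jar \
  -e "use weidian_be_production; select * from workflow_operation limit 3;"
```

The per-invocation form avoids modifying the Spark installation, which can be preferable on managed clusters where /usr/lib/spark-current may be replaced on upgrade.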