
Common exceptions when integrating Flume with Hive on CDH

java.lang.NoClassDefFoundError: org/apache/hive/hcatalog/streaming/RecordWriter

Possible causes:

1. The dependency was never added.

2. Maven may not have downloaded the jar completely.

3. A jar conflict.

Missing dependency — Flume's classpath lacks a required jar.

1. Determine which jar is missing from the exception message.

Searching online pointed to a specific jar, so look for it on the host:

find / -name 'hive-hcatalog-core*'

After filtering out symlinked files, comparing versions, and some guesswork, one jar was picked:

/opt/cloudera/parcels/CDH-5.11.1-1.cdh5.11.1.p0.4/jars/hive-hcatalog-core-1.1.0-cdh5.11.1.jar

(Note that the missing class org.apache.hive.hcatalog.streaming.RecordWriter itself ships in hive-hcatalog-streaming, which is why the fix below copies that jar.)

2. Once the right jar is found, where should it go?

The classpath printed in the flume-ng startup log shows where Flume loads jars from:

/opt/cloudera/parcels/CDH-5.11.1-1.cdh5.11.1.p0.4/lib/flume-ng/lib/*

3. Fix

cp /opt/cloudera/parcels/CDH-5.11.1-1.cdh5.11.1.p0.4/jars/hive-hcatalog-streaming-1.1.0-cdh5.11.1.jar /opt/cloudera/parcels/CDH-5.11.1-1.cdh5.11.1.p0.4/lib/flume-ng/lib/

A symlink works just as well (run it from inside /opt/cloudera/parcels/CDH-5.11.1-1.cdh5.11.1.p0.4/lib/flume-ng/lib/, since the link below is created in the current directory):

ln -s /opt/cloudera/parcels/CDH-5.11.1-1.cdh5.11.1.p0.4/jars/hive-hcatalog-streaming-1.1.0-cdh5.11.1.jar hive-hcatalog-streaming-1.1.0-cdh5.11.1.jar

Exception: java.lang.NoClassDefFoundError: org/apache/hadoop/hive/metastore/api/MetaException

Fix (run from inside flume-ng/lib; the link is created in the current directory):

ln -s /opt/cloudera/parcels/CDH-5.11.1-1.cdh5.11.1.p0.4/jars/hive-metastore-1.1.0-cdh5.11.1.jar hive-metastore-1.1.0-cdh5.11.1.jar

Exception: java.lang.ClassNotFoundException: org.apache.hadoop.hive.ql.session.SessionState

Fix (run from inside flume-ng/lib):

ln -s /opt/cloudera/parcels/CDH-5.11.1-1.cdh5.11.1.p0.4/jars/hive-exec-1.1.0-cdh5.11.1.jar hive-exec-1.1.0-cdh5.11.1.jar

Exception: java.lang.ClassNotFoundException: org.apache.hadoop.hive.cli.CliSessionState

Fix:

ln -s /opt/cloudera/parcels/CDH-5.11.1-1.cdh5.11.1.p0.4/jars/hive-cli-1.1.0-cdh5.11.1.jar /opt/cloudera/parcels/CDH-5.11.1-1.cdh5.11.1.p0.4/lib/flume-ng/lib/hive-cli-1.1.0-cdh5.11.1.jar

Exception: org.apache.commons.cli.MissingOptionException: Missing required option: n

Cause: the agent name was omitted when launching flume-ng; pass -n (--name).
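A typical launch line showing the -n flag; the agent name a1 and the config file name are placeholders to adapt:

```shell
# -n/--name must match the agent name used in the config file.
flume-ng agent -c conf -f conf/hive-sink.conf -n a1 -Dflume.root.logger=INFO,console
```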

Exception: java.lang.ClassNotFoundException: com.facebook.fb303.FacebookService$Iface

Fix:

ln -s /opt/cloudera/parcels/CDH-5.11.1-1.cdh5.11.1.p0.4/jars/libfb303-0.9.3.jar /opt/cloudera/parcels/CDH-5.11.1-1.cdh5.11.1.p0.4/lib/flume-ng/lib/libfb303-0.9.3.jar
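All of the symlinks above follow one pattern, so they can be created in one pass. A sketch assuming the CDH 5.11.1 parcel paths used throughout this post; adjust PARCEL for other versions:

```shell
# Assumption: CDH 5.11.1 parcel layout.
PARCEL=/opt/cloudera/parcels/CDH-5.11.1-1.cdh5.11.1.p0.4
FLUME_LIB=$PARCEL/lib/flume-ng/lib

for jar in hive-hcatalog-streaming-1.1.0-cdh5.11.1.jar \
           hive-metastore-1.1.0-cdh5.11.1.jar \
           hive-exec-1.1.0-cdh5.11.1.jar \
           hive-cli-1.1.0-cdh5.11.1.jar \
           libfb303-0.9.3.jar; do
  # -sfn: create an absolute-path symlink, replacing any stale one.
  ln -sfn "$PARCEL/jars/$jar" "$FLUME_LIB/$jar"
done
```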

Exception: Cannot stream to table that has not been bucketed : {metaStoreUri='thrift://master:9083', database='default', table='t_pages', partitionVals=[] }

Hive Streaming requires the target table to be bucketed:

create table t_pages(
    date string,
    user_id string,
    session_id string,
    page_id string,
    action_time string,
    search_keyword string,
    click_category_id string,
    click_product_id string,
    order_category_ids string,
    order_product_ids string,
    pay_category_ids string,
    pay_product_ids string,
    city_id string
)
CLUSTERED BY (city_id) INTO 20 BUCKETS
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';

Exception: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat cannot be cast to org.apache.hadoop.hive.ql.io.AcidOutputFormat

OrcOutputFormat is the only implementation of AcidOutputFormat, so the table must be STORED AS ORC:

create table t_pages(
    date string,
    user_id string,
    session_id string,
    page_id string,
    action_time string,
    search_keyword string,
    click_category_id string,
    click_product_id string,
    order_category_ids string,
    order_product_ids string,
    pay_category_ids string,
    pay_product_ids string,
    city_id string
)
CLUSTERED BY (city_id) INTO 20 BUCKETS
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS ORC;
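Depending on the Hive build, streaming writes may additionally require the table to be marked transactional and ACID support to be enabled. The property names below are standard Hive settings, but whether this particular CDH build needs them is an assumption worth verifying:

```sql
-- Assumption: this Hive build enforces ACID for streaming writes.
-- Table side: append to the CREATE TABLE above:
--   TBLPROPERTIES ('transactional'='true')
-- Server/session side (hive-site.xml or per session):
SET hive.support.concurrency = true;
SET hive.txn.manager = org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
SET hive.compactor.initiator.on = true;
SET hive.compactor.worker.threads = 1;
```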

Test: check in Hive whether the data is actually arriving.
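A quick sanity check from the Hive shell, using the table from the example above:

```sql
-- The count should grow as Flume commits streaming transaction batches.
SELECT COUNT(*) FROM default.t_pages;
```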

4. Tune the source and channel

If the log keeps warning:

capacity 100 full, consider committing more frequently, increasing capacity, or increasing thread count

the memory channel's transaction queue is filling faster than the sink drains it: commit smaller batches more often, raise the channel's capacity/transactionCapacity, or add sink threads.

5. Better still, switch the channel's storage from memory to file.
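A minimal sketch of a Flume configuration pairing a file channel with the Hive sink; the component names (a1, r1, c1, k1) and directories are placeholders, while the metastore URI and table match the exception message earlier in this post:

```properties
a1.sources = r1
a1.channels = c1
a1.sinks = k1

# Configure a1.sources.r1.* for your actual source.

# File channel: events survive an agent restart, and the channel
# absorbs bursts that would overflow a memory channel.
a1.channels.c1.type = file
a1.channels.c1.checkpointDir = /var/flume/checkpoint
a1.channels.c1.dataDirs = /var/flume/data

# Hive sink (Hive Streaming API).
a1.sinks.k1.type = hive
a1.sinks.k1.channel = c1
a1.sinks.k1.hive.metastore = thrift://master:9083
a1.sinks.k1.hive.database = default
a1.sinks.k1.hive.table = t_pages
a1.sinks.k1.serializer = DELIMITED
a1.sinks.k1.serializer.delimiter = "\t"
a1.sinks.k1.serializer.fieldnames = date,user_id,session_id,page_id,action_time,search_keyword,click_category_id,click_product_id,order_category_ids,order_product_ids,pay_category_ids,pay_product_ids,city_id
```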