Installing Zeppelin and Integrating It with CDH, Flink, and Iceberg
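Contents
I. Service Installation
1 Downloading the Package
2 Installing the Service
2.1 Environment Configuration
2.2 Node Configuration
2.3 Starting the Service
2.4 Accessing the Service
II. Basic Usage: Flink
1 Configuring Interpreters in the Web UI
2 Demo Test
III. Basic Usage: Iceberg
1 Configuring the flink-iceberg JAR
2 Demo Test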
I. Service Installation
1 Downloading the Package
Download page:
https://zeppelin.apache.org/download.html
2 Installing the Service
Installation path: /opt/soft/zeppelin-0.10.1-bin-all
    cd /opt/soft
    tar -zxvf zeppelin-0.10.1-bin-all.tgz
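Before extracting, it can be worth comparing the archive against the SHA-512 checksum published alongside it. The following is a minimal sketch of that check; the helper name verify_sha512 and the convention of passing the expected digest as a hex string are assumptions made for illustration and are not part of the original procedure.

```shell
# verify_sha512 FILE EXPECTED_HEX
# Hypothetical helper: returns 0 when the SHA-512 digest of FILE equals
# EXPECTED_HEX (compared case-insensitively), non-zero otherwise.
verify_sha512() {
    file=$1
    # Normalise the expected digest to lower case, as sha512sum prints it.
    expected=$(printf '%s' "$2" | tr 'A-F' 'a-f')
    # sha512sum prints "<digest>  <file>"; keep only the digest.
    actual=$(sha512sum "$file" | awk '{print $1}')
    [ "$actual" = "$expected" ]
}
```

With the digest copied from the download page, extraction could then be gated on the check, for example: verify_sha512 zeppelin-0.10.1-bin-all.tgz "<published digest>" && tar -zxvf zeppelin-0.10.1-bin-all.tgz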
2.1 Environment Configuration
    cd zeppelin-0.10.1-bin-all/conf
    # Create the environment file from the template, then edit it
    cp zeppelin-env.sh.template zeppelin-env.sh
    vim zeppelin-env.sh
    # Add the following settings
    export JAVA_HOME=/usr/java/jdk1.8.0_181-cloudera
    export HADOOP_CONF_DIR=/etc/hadoop/conf.cloudera.yarn
    export HADOOP_HOME=/opt/cloudera/parcels/CDH-6.3.1-1.cdh6.3.1.p0.1470567/lib/hadoop
    export SPARK_HOME=/opt/cloudera/parcels/CDH-6.3.1-1.cdh6.3.1.p0.1470567/lib/spark
    export MASTER=yarn-cluster
Note: The Java and Hadoop paths above are specific to this environment; adjust them to match your own.
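Because these paths differ from one cluster to the next, a quick existence check can catch a wrong parcel version or JDK path before Zeppelin is started. The sketch below is one way to do that; check_dirs is a hypothetical helper name, and the NAME=PATH argument format is chosen only for readable output.

```shell
# check_dirs NAME=PATH ...
# Hypothetical helper: prints one line per argument and returns non-zero
# if any of the given paths is not an existing directory.
check_dirs() {
    status=0
    for pair in "$@"; do
        name=${pair%%=*}   # text before the first '='
        path=${pair#*=}    # text after the first '='
        if [ -d "$path" ]; then
            echo "ok      $name=$path"
        else
            echo "missing $name=$path"
            status=1
        fi
    done
    return $status
}

# Possible usage after editing zeppelin-env.sh (run from the conf directory):
#   . ./zeppelin-env.sh
#   check_dirs "JAVA_HOME=$JAVA_HOME" "HADOOP_CONF_DIR=$HADOOP_CONF_DIR" \
#              "HADOOP_HOME=$HADOOP_HOME" "SPARK_HOME=$SPARK_HOME"
```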
2.2 Node Configuration
    cp zeppelin-site.xml.template zeppelin-site.xml
    vim zeppelin-site.xml

    <property>
      <name>zeppelin.server.addr</name>
      <value>**.**.**.**</value>
      <description>Server binding address</description>
    </property>
    <property>
      <name>zeppelin.server.port</name>
      <value>****</value>
      <description>Server port.</description>
    </property>
Note: Only the actual IP address and port need to be set here. For cluster mode, configure the zeppelin.cluster.addr property.
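To confirm which address and port ended up in the file without opening an editor, the value of a property can be read back from the command line. The sketch below assumes the layout used in the template, with <name> and <value> on their own lines; site_property is a hypothetical helper, it does not skip properties that are commented out, and it is not a general XML parser.

```shell
# site_property FILE NAME
# Hypothetical helper: prints the <value> that follows <name>NAME</name>
# in FILE, or nothing if the property is absent.
site_property() {
    awk -v want="$2" '
        /<name>/ {
            n = $0
            sub(/.*<name>/, "", n)
            sub(/<\/name>.*/, "", n)
            hit = (n == want)      # remember whether this is the wanted property
        }
        hit && /<value>/ {
            v = $0
            sub(/.*<value>/, "", v)
            sub(/<\/value>.*/, "", v)
            print v
            exit
        }
    ' "$1"
}

# Possible usage from the conf directory:
#   site_property zeppelin-site.xml zeppelin.server.port
```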
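2.3 Starting the Service
    # Run from the installation directory, /opt/soft/zeppelin-0.10.1-bin-all
    ./bin/zeppelin-daemon.sh start   # start the service
    ./bin/zeppelin-daemon.sh stop    # stop the service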
2.4 Accessing the Service
URL: http://ip:port (the address and port set in zeppelin-site.xml)
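The web UI can take a little while to come up after the daemon starts. A scripted check such as the sketch below polls the URL until it answers; wait_for_http is a hypothetical helper, the defaults of 30 attempts and a 2-second delay are arbitrary, and it assumes curl is installed on the host.

```shell
# wait_for_http URL [ATTEMPTS] [DELAY_SECONDS]
# Hypothetical helper: polls URL with curl until it answers successfully,
# or gives up after ATTEMPTS tries. Returns 0 on success, 1 on timeout.
wait_for_http() {
    url=$1
    attempts=${2:-30}
    delay=${3:-2}
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # -f: treat HTTP errors as failure, -s: silent, --max-time: per-try cap
        if curl -fs -o /dev/null --max-time 5 "$url" 2>/dev/null; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}

# Possible usage, substituting the configured address and port:
#   wait_for_http "http://ip:port/" && echo "Zeppelin is reachable"
```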
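II. Basic Usage: Flink
1 Configuring Interpreters in the Web UI
First, click Interpreters in the upper-right corner.
Then search for flink, edit its settings, and configure the following four properties:
    FLINK_HOME=/opt/flink
    HADOOP_CONF_DIR=/etc/hadoop/conf.cloudera.yarn
    HIVE_CONF_DIR=/etc/hive/conf.cloudera.hive
    flink.execution.mode=yarn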
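2 Demo Test
Create a new note and enter the following demo to test the interpreter:
    %flink
    // Word count over three in-memory strings using the batch environment (benv)
    val data = benv.fromElements("Hello Kobe", "Hello Jordan", "Hello James")
    data.flatMap(record => record.split("\\s"))  // split each line on whitespace
      .map(word => (word, 1))                    // pair every word with a count of 1
      .groupBy(0)                                // group by the word
      .sum(1)                                    // sum the counts
      .print()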
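The job splits each string on whitespace, pairs every word with 1, groups by the word, and sums the counts. As a cross-check of what the paragraph should report, the same counts can be reproduced outside Flink with a small shell pipeline. This only illustrates the logic and is not part of the Zeppelin setup; Flink prints tuples such as (Hello,3) and does not guarantee the ordering shown here.

```shell
# Count the words in the three demo strings, mirroring
# flatMap -> map -> groupBy -> sum from the Flink paragraph.
word_count() {
    printf '%s\n' "Hello Kobe" "Hello Jordan" "Hello James" |
        awk '{ for (i = 1; i <= NF; i++) count[$i]++ }
             END { for (w in count) print w, count[w] }' |
        LC_ALL=C sort
}

word_count
# Hello 3
# James 1
# Jordan 1
# Kobe 1
```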
III. Basic Usage: Iceberg
1 Configuring the flink-iceberg JAR
Add the following property to the flink interpreter settings:
    flink.execution.jars=/opt/flink/lib/iceberg-flink-runtime-1.13-0.13.1.jar
If this property is not set, the following error is reported:
    Could not find any factory for identifier 'iceberg' that implements 'org.apache.flink.table.factories.CatalogFactory' in the classpath.

    Caused by: org.apache.flink.table.api.ValidationException: Could not find any factory for identifier 'iceberg' that implements 'org.apache.flink.table.factories.CatalogFactory' in the classpath.
    Available factory identifiers are:
    generic_in_memory
        at org.apache.flink.table.factories.FactoryUtil.discoverFactory(FactoryUtil.java:319) ~[flink-table-blink_2.11-1.13.6.jar:1.13.6]
        at org.apache.flink.table.factories.FactoryUtil.getCatalogFactory(FactoryUtil.java:455) ~[flink-table-blink_2.11-1.13.6.jar:1.13.6]
        at org.apache.flink.table.factories.FactoryUtil.createCatalog(FactoryUtil.java:251) ~[flink-table-blink_2.11-1.13.6.jar:1.13.6]
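Besides the property being unset, it may be worth confirming that the JAR named in flink.execution.jars was built for the Flink line in use. The stack trace shows Flink 1.13.6, and the JAR name carries 1.13. The sketch below extracts both and lets them be compared; the two helper names are hypothetical, and the file-name pattern iceberg-flink-runtime-<flink minor>-<iceberg version>.jar is inferred from the JAR used in this guide rather than from a formal naming rule.

```shell
# flink_minor_of_runtime_jar PATH
# Hypothetical helper: prints the Flink minor version embedded in a JAR name of
# the form iceberg-flink-runtime-<flink minor>-<iceberg version>.jar,
# or nothing if the name does not follow that pattern.
flink_minor_of_runtime_jar() {
    basename "$1" |
        sed -n 's/^iceberg-flink-runtime-\([0-9][0-9]*\.[0-9][0-9]*\)-.*\.jar$/\1/p'
}

# flink_minor_of_version VERSION
# Hypothetical helper: reduces a full Flink version such as 1.13.6 to 1.13.
flink_minor_of_version() {
    printf '%s\n' "$1" | cut -d. -f1,2
}

# Possible usage:
#   jar=/opt/flink/lib/iceberg-flink-runtime-1.13-0.13.1.jar
#   [ "$(flink_minor_of_runtime_jar "$jar")" = "$(flink_minor_of_version 1.13.6)" ] \
#       && echo "JAR matches the Flink line"
```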
2 Demo Test
The final query assumes that the table iceberg_db.sample already exists in the Hive catalog.
    %flink.bsql
    show catalogs;
    CREATE CATALOG hive_catalog WITH (
      'type'='iceberg',
      'catalog-type'='hive',
      'uri'='thrift://cdh-test01:9083',
      'clients'='5',
      'property-version'='1',
      'warehouse'='hdfs://hdfsCluster/user/hive/warehouse'
    );
    use catalog hive_catalog;
    select * from `hive_catalog`.`iceberg_db`.`sample`;