Installing Zeppelin and Integrating It with CDH, Flink, and Iceberg

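Contents

I. Service Installation

1 Package Download

2 Service Installation

2.1 Environment Configuration

2.2 Node Configuration

2.3 Starting the Service

2.4 Accessing the Service

II. Basic Usage: Flink

1 Configuring Interpreters in the Web UI

2 Demo Test

III. Basic Usage: Iceberg

1 Configuring the flink-iceberg JAR

2 Demo Test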

I. Service Installation

1 Package Download

Download page:

https://zeppelin.apache.org/download.html

2 Service Installation

Installation path: /opt/soft/zeppelin-0.10.1-bin-all

cd /opt/soft

tar -zxvf zeppelin-0.10.1-bin-all.tgz

2.1 Environment Configuration

cd zeppelin-0.10.1-bin-all/conf

# Create the configuration file from the template

cp zeppelin-env.sh.template zeppelin-env.sh

vim zeppelin-env.sh   # add the following settings

export JAVA_HOME=/usr/java/jdk1.8.0_181-cloudera

export HADOOP_CONF_DIR=/etc/hadoop/conf.cloudera.yarn

export HADOOP_HOME=/opt/cloudera/parcels/CDH-6.3.1-1.cdh6.3.1.p0.1470567/lib/hadoop

export SPARK_HOME=/opt/cloudera/parcels/CDH-6.3.1-1.cdh6.3.1.p0.1470567/lib/spark

export MASTER=yarn-cluster

Note: adjust the Java and Hadoop paths above to match your own environment.
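Because these paths differ between CDH installations, it can help to confirm that each one exists before starting Zeppelin. The following is a minimal sketch, not part of Zeppelin itself; the function name check_dirs is hypothetical.

```shell
# Report which of the given directories are missing.
# Prints one "MISSING: <path>" line per absent directory and
# returns non-zero if at least one is missing.
check_dirs() {
  local missing=0
  local dir
  for dir in "$@"; do
    if [ ! -d "$dir" ]; then
      echo "MISSING: $dir"
      missing=1
    fi
  done
  return "$missing"
}

# Example call, after sourcing zeppelin-env.sh:
# check_dirs "$JAVA_HOME" "$HADOOP_CONF_DIR" "$HADOOP_HOME" "$SPARK_HOME"
```

If the call prints nothing and returns 0, all four directories are present on this node.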

2.2 Node Configuration

cp zeppelin-site.xml.template zeppelin-site.xml

vim zeppelin-site.xml

<property>

<name>zeppelin.server.addr</name>

<value>**.**.**.**</value>

<description>Server binding address</description>

</property>

<property>

<name>zeppelin.server.port</name>

<value>****</value>

<description>Server port.</description>

</property>

Note: only the actual IP address and port need to be set here. For cluster mode, configure the zeppelin.cluster.addr property.

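2.3 Starting the Service

cd /opt/soft/zeppelin-0.10.1-bin-all   # the commands below are relative to the installation directory

./bin/zeppelin-daemon.sh start   # start the service

./bin/zeppelin-daemon.sh stop   # stop the service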

2.4 Accessing the Service

URL: http://ip:port (the address and port set in zeppelin-site.xml)
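Besides opening the page in a browser, the service can be probed from the command line through the version endpoint of Zeppelin's REST API (/api/version). This is a minimal sketch that assumes curl is installed; the function names and the host and port in the example are hypothetical placeholders.

```shell
# Build the URL of Zeppelin's version endpoint from a host and a port.
zeppelin_version_url() {
  printf 'http://%s:%s/api/version\n' "$1" "$2"
}

# Query the endpoint. With -f, curl returns a non-zero status on HTTP
# errors, so the function fails if the server is not answering properly.
zeppelin_check() {
  curl -fsS --max-time 5 "$(zeppelin_version_url "$1" "$2")"
}

# Example (placeholder host and port, replace with your own values):
# zeppelin_check 192.0.2.10 8080
```

A JSON response from zeppelin_check suggests that the server is up and bound to the configured address.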

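II. Basic Usage: Flink

1 Configuring Interpreters in the Web UI

First, click Interpreters in the upper-right corner.

Then search for flink and edit its settings. Only the following four properties need to be set: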

FLINK_HOME=/opt/flink

HADOOP_CONF_DIR=/etc/hadoop/conf.cloudera.yarn

HIVE_CONF_DIR=/etc/hive/conf.cloudera.hive

flink.execution.mode=yarn
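Before saving these values in the web UI, they can be sanity-checked on the Zeppelin host. The sketch below is only an illustration: the function names are hypothetical, and the list of accepted execution modes is an assumption based on the Zeppelin 0.10 Flink interpreter documentation, so verify it against the version you run.

```shell
# Return 0 if the value is an execution mode this sketch knows about.
# Assumption: the list reflects the Zeppelin 0.10 Flink interpreter docs.
is_known_flink_mode() {
  case "$1" in
    local|remote|yarn|yarn-application) return 0 ;;
    *) return 1 ;;
  esac
}

# Return 0 if the directory looks like an unpacked Flink distribution,
# judged only by the presence of bin/flink and a lib directory.
looks_like_flink_home() {
  [ -f "$1/bin/flink" ] && [ -d "$1/lib" ]
}

# Example with the values from this section:
# looks_like_flink_home /opt/flink && is_known_flink_mode yarn
```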

2 Demo Test

Create a new note and enter the following demo to test the interpreter:

%flink

val data=benv.fromElements("Hello Kobe","Hello Jordan","Hello James")

data.flatMap(record=>record.split("\\s"))

.map(word=>(word,1))

.groupBy(0)

.sum(1)

.print()
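To cross-check the counts produced by the demo, the same word count can be reproduced outside Flink with standard shell tools. This is only a sketch for comparison, not a Flink job; the function name word_count is hypothetical.

```shell
# Count whitespace-separated words read from standard input and
# print "<word> <count>" lines sorted by word.
word_count() {
  tr -s '[:space:]' '\n' |
    awk 'NF { count[$1]++ } END { for (w in count) print w, count[w] }' |
    sort
}

# The three input strings used in the Flink demo.
printf '%s\n' "Hello Kobe" "Hello Jordan" "Hello James" | word_count
```

The pipeline prints Hello 3, James 1, Jordan 1, and Kobe 1. The Flink demo should report the same counts as (word, count) tuples, though the order of its output lines is not guaranteed.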

III. Basic Usage: Iceberg

1 Configuring the flink-iceberg JAR

How to configure:

flink.execution.jars=/opt/flink/lib/iceberg-flink-runtime-1.13-0.13.1.jar

If this property is not set, the following error is reported:

Could not find any factory for identifier 'iceberg' that implements 'org.apache.flink.table.factories.CatalogFactory' in the classpath.

Caused by: org.apache.flink.table.api.ValidationException: Could not find any factory for identifier 'iceberg' that implements 'org.apache.flink.table.factories.CatalogFactory' in the classpath.

Available factory identifiers are:

generic_in_memory

at org.apache.flink.table.factories.FactoryUtil.discoverFactory(FactoryUtil.java:319) ~[flink-table-blink_2.11-1.13.6.jar:1.13.6]

at org.apache.flink.table.factories.FactoryUtil.getCatalogFactory(FactoryUtil.java:455) ~[flink-table-blink_2.11-1.13.6.jar:1.13.6]

at org.apache.flink.table.factories.FactoryUtil.createCatalog(FactoryUtil.java:251) ~[flink-table-blink_2.11-1.13.6.jar:1.13.6]
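When this error appears, one quick check is whether the jar named in flink.execution.jars is actually present on the Zeppelin host and which service-loader entries it packages. This is a minimal sketch that assumes the unzip command is installed; the function name list_jar_services is hypothetical.

```shell
# Print the META-INF/services entries packaged in a jar.
# Fails with a message on stderr if the jar file does not exist.
list_jar_services() {
  local jar_path=$1
  if [ ! -f "$jar_path" ]; then
    echo "jar not found: $jar_path" >&2
    return 1
  fi
  # -Z1 switches unzip to zipinfo mode and lists file names only.
  unzip -Z1 "$jar_path" | grep '^META-INF/services/.'
}

# Example with the path used in this section:
# list_jar_services /opt/flink/lib/iceberg-flink-runtime-1.13-0.13.1.jar
```

If the function reports that the jar is missing, or lists no factory-related entries, the interpreter probably cannot discover the iceberg catalog factory from that path.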

2 Demo Test

%flink.bsql

show catalogs;

CREATE CATALOG hive_catalog WITH (

'type'='iceberg',

'catalog-type'='hive',

'uri'='thrift://cdh-test01:9083',

'clients'='5',

'property-version'='1',

'warehouse'='hdfs://hdfsCluster/user/hive/warehouse'

);

use catalog hive_catalog;

select * from `hive_catalog`.`iceberg_db`.`sample`;