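Big Data Flume Series: Setting Up a Flume Cluster
1. Concept
A cluster here means multiple machines, at least two: one machine collects data from a data source and sends it to the other machine, which then outputs it. The following sections build such a Flume cluster. The cluster is shown in the figure below.
2. Flume Setup
2.1 Deployment preparation
- Deployment hosts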
192.168.9.139 host14
192.168.9.128 host15
- Download the Flume package on host14
# cd /opt/tools
# wget http://mirrors.tuna.tsinghua.edu.cn/apache/flume/1.7.0/apache-flume-1.7.0-bin.tar.gz
- Unpack the Flume package
# mkdir -p /apps/svr/flume/
# tar -zxvf /opt/tools/apache-flume-1.7.0-bin.tar.gz -C /apps/svr/flume/
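The configuration later in this article refers to the two machines by name (host14, host15), so both names need to resolve on both machines. The article does not show how that is done; a common approach, assumed here, is an entry per host in /etc/hosts. The sketch below checks a hosts-format file for such entries. `has_host_entry` is a hypothetical helper, not part of Flume, and the demo runs against a temporary file rather than the real /etc/hosts.

```shell
# Sketch: check that a hosts-format file maps an IP address to a host name.
# has_host_entry is a hypothetical helper name, not a Flume or system command.
has_host_entry() {
    # $1 = hosts file, $2 = IP address, $3 = host name
    awk -v ip="$2" -v name="$3" '
        /^[ \t]*#/ { next }                  # skip comment lines
        $1 == ip {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^#/) break         # stop at a trailing comment
                if ($i == name) found = 1
            }
        }
        END { exit(found ? 0 : 1) }
    ' "$1"
}

# Demo against a temporary file so the real system is not touched.
demo_hosts=$(mktemp)
printf '%s\n' '192.168.9.139 host14' '192.168.9.128 host15' > "$demo_hosts"

if has_host_entry "$demo_hosts" 192.168.9.139 host14; then
    echo "ok: host14 -> 192.168.9.139"
else
    echo "missing: host14 -> 192.168.9.139"
fi
if has_host_entry "$demo_hosts" 192.168.9.128 host15; then
    echo "ok: host15 -> 192.168.9.128"
else
    echo "missing: host15 -> 192.168.9.128"
fi

rm -f "$demo_hosts"
```

On a real machine the same function could be pointed at /etc/hosts, for example `has_host_entry /etc/hosts 192.168.9.128 host15`.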
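2.2 Deploying Flume
Because this is a cluster, Flume must be installed on both machines. host14 acts as the push agent that sends data, and host15 acts as the pull agent that receives the data and displays it.
- Edit the configuration files
# cd /apps/svr/flume/apache-flume-1.7.0-bin/conf/
# cp flume-env.sh.template flume-env.sh
# cp flume-conf.properties.template flume-telent.conf
# vim flume-env.sh
export JAVA_HOME=/apps/svr/java/jdk1.8.0_172
- Deploy Flume to host15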
# scp -r /apps/svr/flume/ host15:/apps/svr/
- Verify Flume
# /apps/svr/flume/apache-flume-1.7.0-bin/bin/flume-ng version
Flume 1.7.0
Source code repository: https://git-wip-us.apache.org/repos/asf/flume.git
Revision: 511d868555dd4d16e6ce4fedc72c2d1454546707
Compiled by bessbd on Wed Oct 12 20:51:10 CEST 2016
From source with checksum 0d21b3ffdc55a07e1d08875872c00523
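Since the installation is copied from host14 to host15, it can be useful to confirm that both hosts report the same version. The following is a minimal sketch that pulls the version number out of `flume-ng version` output; `flume_version_of` is a hypothetical helper name, and the demo feeds it the sample output from this article rather than calling Flume.

```shell
# Sketch: extract the version number from `flume-ng version` output.
# flume_version_of is a hypothetical helper, not part of Flume.
flume_version_of() {
    # Reads the command output on stdin and prints the second field of
    # the first line that has exactly the form "Flume <version>".
    awk '$1 == "Flume" && NF == 2 { print $2; exit }'
}

# Demo using the sample output shown in this article.
printf '%s\n' \
    'Flume 1.7.0' \
    'Source code repository: https://git-wip-us.apache.org/repos/asf/flume.git' \
    'Revision: 511d868555dd4d16e6ce4fedc72c2d1454546707' \
    | flume_version_of
# prints 1.7.0
```

On each host the real command can be piped in the same way: `/apps/svr/flume/apache-flume-1.7.0-bin/bin/flume-ng version | flume_version_of`.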
2.3 Flume cluster configuration
- Configure push.conf
[host14]
# cd /apps/svr/flume/apache-flume-1.7.0-bin/conf
# vim push.conf
# Name the components on this agent
a2.sources= r1
a2.sinks= k1
a2.channels= c1
# Describe/configure the source
a2.sources.r1.type= spooldir
a2.sources.r1.spoolDir= /apps/svr/flume/logs
a2.sources.r1.channels= c1
# Use a channel which buffers events in memory
a2.channels.c1.type= memory
a2.channels.c1.keep-alive= 10
a2.channels.c1.capacity= 100000
a2.channels.c1.transactionCapacity= 100000
# Describe the sink
a2.sinks.k1.type= avro
a2.sinks.k1.channel= c1
a2.sinks.k1.hostname= host15
a2.sinks.k1.port= 8899
- Configure pull.conf
[host15]
# cd /apps/svr/flume/apache-flume-1.7.0-bin/conf
# vim pull.conf
# Name the components on this agent
a1.sources= r1
a1.sinks= k1
a1.channels= c1
# Describe/configure the source
a1.sources.r1.type= avro
a1.sources.r1.channels= c1
a1.sources.r1.bind= host15
a1.sources.r1.port= 8899
# Describe the sink
a1.sinks.k1.type= logger
a1.sinks.k1.channel = c1
# Use a channel which buffers events in memory
a1.channels.c1.type= memory
a1.channels.c1.keep-alive= 10
a1.channels.c1.capacity= 100000
a1.channels.c1.transactionCapacity= 100000
- Create the spoolDir directory
[host14]
# mkdir -p /apps/svr/flume/logs
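In both files a source is bound with the plural key `channels` and a sink with the singular key `channel`, and each value must name a channel listed under `<agent>.channels`. A typo in any of these names is easy to miss. The sketch below is one way to check that wiring before starting an agent. `check_channel_wiring` is a hypothetical helper, not a Flume tool, and it only looks at the channel names, not at any other property.

```shell
# Sketch: verify that every channel referenced by a source or sink is
# declared in "<agent>.channels". Hypothetical helper, not part of Flume.
check_channel_wiring() {
    # $1 = agent name (for example a2), $2 = properties file
    awk -v agent="$1" '
        /^[ \t]*#/ || !/=/ { next }          # skip comments and non-properties
        {
            eq  = index($0, "=")
            key = substr($0, 1, eq - 1)
            val = substr($0, eq + 1)
            gsub(/^[ \t]+|[ \t]+$/, "", key) # trim whitespace around the key
            gsub(/^[ \t]+|[ \t]+$/, "", val) # trim whitespace around the value
        }
        key == agent ".channels" {
            n = split(val, names, /[ \t]+/)
            for (i = 1; i <= n; i++) declared[names[i]] = 1
            next
        }
        key ~ ("^" agent "\\.(sources|sinks)\\.[^.]+\\.channels?$") {
            n = split(val, refs, /[ \t]+/)
            for (i = 1; i <= n; i++) used[refs[i]] = key
        }
        END {
            bad = 0
            for (c in used) if (!(c in declared)) {
                print "undeclared channel " c " referenced by " used[c]
                bad = 1
            }
            exit bad
        }
    ' "$2"
}

# Demo on a temporary copy of the channel-related lines from push.conf.
conf=$(mktemp)
printf '%s\n' \
    'a2.sources= r1' \
    'a2.sinks= k1' \
    'a2.channels= c1' \
    'a2.sources.r1.channels= c1' \
    'a2.sinks.k1.channel= c1' > "$conf"
check_channel_wiring a2 "$conf" && echo "wiring ok"
rm -f "$conf"
```

Against the real files this would be run as `check_channel_wiring a2 conf/push.conf` on host14 and `check_channel_wiring a1 conf/pull.conf` on host15.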
2.4 Starting the Flume cluster
- Start the pull host
[host15]
# cd /apps/svr/flume/apache-flume-1.7.0-bin/
# ./bin/flume-ng agent -c conf -f conf/pull.conf -n a1 -Dflume.root.logger=INFO,console
The agent has started successfully if the output matches the figure.
- Start the push host
[host14]
# cd /apps/svr/flume/apache-flume-1.7.0-bin/
# ./bin/flume-ng agent -n a2 -c conf -f conf/push.conf -Dflume.root.logger=INFO,console
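The agent has started successfully if the output matches the figure.
- Verify the connection
[host15]
The connection has been established if the output matches the figure.
3. Testing Flume
3.1 Create a test case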
[host14]
# cd /apps/svr/flume/logs/
# vim flume-use-case-test.log
HELLO WORLD!!!
HELLO FLUME!!!
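The steps above write the file with an editor directly inside the spool directory. The Flume documentation for the spooling directory source asks that files not change once they are placed there and that file names not be reused, so a more cautious variant is to write the file elsewhere and move it in when it is complete. The sketch below shows that idea. `spool_drop` and the staging directory are hypothetical names, the staging directory is assumed to be on the same filesystem as the spool directory so the move is a rename, and `.COMPLETED` is the source's default suffix for processed files. The demo uses temporary directories instead of /apps/svr/flume/logs.

```shell
# Sketch: write a file in a staging directory, then move it into the
# spool directory in one step. spool_drop is a hypothetical helper.
spool_drop() {
    # $1 = staging dir (assumed to be on the same filesystem as the spool dir)
    # $2 = spool dir, $3 = file name; file content is read from stdin
    tmp="$1/$3.$$"
    cat > "$tmp" || return 1
    # Refuse to reuse a name that is pending or already processed
    # (.COMPLETED is the default suffix for processed files).
    if [ -e "$2/$3" ] || [ -e "$2/$3.COMPLETED" ]; then
        rm -f "$tmp"
        echo "spool_drop: $3 already used in $2" >&2
        return 1
    fi
    mv "$tmp" "$2/$3"
}

# Demo with temporary directories standing in for the real paths.
base=$(mktemp -d)
mkdir -p "$base/staging" "$base/logs"
printf '%s\n' 'HELLO WORLD!!!' 'HELLO FLUME!!!' \
    | spool_drop "$base/staging" "$base/logs" flume-use-case-test.log
ls "$base/logs"
rm -rf "$base"
```

On host14 the spool directory argument would be /apps/svr/flume/logs, with a staging directory of your choice next to it.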
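3.2 Verify the test
- pull host
The test has passed if the output matches the figure.
- push host
The test has passed if the output matches the figure.
Conclusion: the test case passed, which shows that the Flume cluster was set up successfully.
Original article: https://1csh1.github.io/2016/04/21/Flume%E9%9B%86%E7%BE%A4%E6%90%AD%E5%BB%BA/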