Flume基礎(八):企業開發案例(五)
阿新 • • 發佈:2020-07-26
單資料來源多出口案例(Sink 組)
單 Source、Channel 多 Sink(負載均衡)如圖 7-3 所示。 圖 7-3 單 Source、Channel 多 Sink 1)案例需求:使用 Flume-1 監控檔案變動,Flume-1 將變動內容傳遞給 Flume-2,Flume-2負責儲存到 HDFS。同時 Flume-1 將變動內容傳遞給 Flume-3,Flume-3 也負責儲存到 HDFS 2)需求分析: 3)實現步驟: 0.準備工作 在/opt/module/flume/job 目錄下建立 group2 資料夾[atguigu@hadoop102 job]$ cd group2/
[atguigu@hadoop102 group2]$ touch flume-netcat-flume.conf
[atguigu@hadoop102 group2]$ vim flume-netcat-flume.conf
新增如下內容
# Name the components on this注:Avro 是由 Hadoop 創始人 Doug Cutting 建立的一種語言無關的資料序列化和 RPC 框架。 注:RPC(Remote Procedure Call)—遠端過程呼叫,它是一種通過網路從遠端計算機程式上請求服務,而不需要了解底層網路技術的協議。 2.建立 flume-flume-console1.conf 配置上級 Flume 輸出的 Source,輸出是到本地控制檯。 建立配置檔案並開啟agent a1.sources = r1 a1.channels = c1 a1.sinkgroups = g1 a1.sinks = k1 k2 # Describe/configure the source a1.sources.r1.type = netcat a1.sources.r1.bind = localhost a1.sources.r1.port = 44444 a1.sinkgroups.g1.processor.type = load_balance a1.sinkgroups.g1.processor.backoff = true a1.sinkgroups.g1.processor.selector= round_robin a1.sinkgroups.g1.processor.selector.maxTimeOut=10000 # Describe the sink a1.sinks.k1.type = avro a1.sinks.k1.hostname = hadoop102 a1.sinks.k1.port = 4141 a1.sinks.k2.type = avro a1.sinks.k2.hostname = hadoop102 a1.sinks.k2.port = 4142 # Describe the channel a1.channels.c1.type = memory a1.channels.c1.capacity = 1000 a1.channels.c1.transactionCapacity = 100 # Bind the source and sink to the channel a1.sources.r1.channels = c1 a1.sinkgroups.g1.sinks = k1 k2 a1.sinks.k1.channel = c1 a1.sinks.k2.channel = c1
[atguigu@hadoop102 group2]$ touch flume-flume-console1.conf
[atguigu@hadoop102 group2]$ vim flume-flume-console1.conf
新增如下內容
# Name the components on this agent a2.sources = r1 a2.sinks = k1 a2.channels = c1 # Describe/configure the source a2.sources.r1.type = avro a2.sources.r1.bind = hadoop102 a2.sources.r1.port = 4141 # Describe the sink a2.sinks.k1.type = logger # Describe the channel a2.channels.c1.type = memory a2.channels.c1.capacity = 1000 a2.channels.c1.transactionCapacity = 100 # Bind the source and sink to the channel a2.sources.r1.channels = c1 a2.sinks.k1.channel = c13.建立 flume-flume-console2.conf 配置上級 Flume 輸出的 Source,輸出是到本地控制檯。 建立配置檔案並開啟
[atguigu@hadoop102 group2]$ touch flume-flume-console2.conf
[atguigu@hadoop102 group2]$ vim flume-flume-console2.conf
新增如下內容
# Name the components on this agent a3.sources = r1 a3.sinks = k1 a3.channels = c2 # Describe/configure the source a3.sources.r1.type = avro a3.sources.r1.bind = hadoop102 a3.sources.r1.port = 4142 # Describe the sink a3.sinks.k1.type = logger # Describe the channel a3.channels.c2.type = memory a3.channels.c2.capacity = 1000 a3.channels.c2.transactionCapacity = 100 # Bind the source and sink to the channel a3.sources.r1.channels = c2 a3.sinks.k1.channel = c24.執行配置檔案 分別開啟對應配置檔案:flume-flume-console2,flume-flume-console1,flume-netcat-flume。
[atguigu@hadoop102 flume]$ bin/flume-ng agent --conf conf/ --name a3 --conf-file job/group2/flume-flume-console2.conf -Dflume.root.logger=INFO,console [atguigu@hadoop102 flume]$ bin/flume-ng agent --conf conf/ --name a2 --conf-file job/group2/flume-flume-console1.conf -Dflume.root.logger=INFO,console [atguigu@hadoop102 flume]$ bin/flume-ng agent --conf conf/ --name a1 --conf-file job/group2/flume-netcat-flume.conf5. 使用 telnet 工具向本機的 44444 埠傳送內容
$ telnet localhost 44444
6. 檢視 Flume2 及 Flume3 的控制檯列印日誌