1.6-1.7 Defining an agent that reads logs and stores them in HDFS
By 阿新 • Published 2021-07-10
I. Define the agent and run it
1. The configuration file
The plan: collect Hive's log and store it in HDFS.

- log file: /opt/cdh-5.3.6/hive-0.13.1-cdh5.3.6/logs/hive.log, read with `tail -f`
- source: Exec source, which runs a given Unix command at startup and expects the process to continuously produce data on standard output
- channel: memory
- sink: hdfs, writing to /user/root/flume/hive-logs/

Prepare the agent configuration file, flume-tail.conf, as follows:
```
# The configuration file needs to define the sources,
# the channels and the sinks.

#### define agent
a2.sources = r2
a2.channels = c2
a2.sinks = k2

### define sources
a2.sources.r2.type = exec
a2.sources.r2.command = tail -f /opt/cdh-5.3.6/hive-0.13.1-cdh5.3.6/logs/hive.log
a2.sources.r2.shell = /bin/bash -c

### define channel
a2.channels.c2.type = memory
a2.channels.c2.capacity = 1000
a2.channels.c2.transactionCapacity = 100

### define sink
a2.sinks.k2.type = hdfs
a2.sinks.k2.hdfs.path = hdfs://hadoop-senior.ibeifeng.com:8020/user/root/flume/hive-logs/
a2.sinks.k2.hdfs.fileType = DataStream
a2.sinks.k2.hdfs.writeFormat = Text
a2.sinks.k2.hdfs.batchSize = 10

### bind the sources and sink to the channel
a2.sources.r2.channels = c2
a2.sinks.k2.channel = c2
```
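A side note: by default the HDFS sink rolls to a new file every 30 seconds, every 1024 bytes, or every 10 events, whichever comes first, which is why a short run produces many small FlumeData files. These are documented Flume HDFS-sink properties; the values below are illustrative examples, not part of the original setup:

```
# Optional tuning of file rolling (example values):
a2.sinks.k2.hdfs.rollInterval = 600       # roll at most every 10 minutes
a2.sinks.k2.hdfs.rollSize = 134217728     # ...or when the file reaches 128 MB
a2.sinks.k2.hdfs.rollCount = 0            # disable rolling by event count
```

Fewer, larger files are generally friendlier to HDFS and downstream MapReduce jobs.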
2. Run it
Start Flume so it begins tailing and ingesting in real time:

```
[root@hadoop-senior flume-1.5.0-cdh5.3.6]# bin/flume-ng agent -c conf -n a2 -f conf/flume-tail.conf -Dflume.root.logger=DEBUG,console
```

Now run a few commands in Hive to generate some log output. Checking HDFS, many files have already been ingested:

```
[root@hadoop-senior hadoop-2.5.0-cdh5.3.6]# bin/hdfs dfs -ls -R /user/root/flume/hive-logs/
-rw-r--r--   3 root supergroup   1133 2019-05-08 13:43 /user/root/flume/hive-logs/FlumeData.1557294191838
-rw-r--r--   3 root supergroup    534 2019-05-08 13:43 /user/root/flume/hive-logs/FlumeData.1557294191839
-rw-r--r--   3 root supergroup   1056 2019-05-08 13:59 /user/root/flume/hive-logs/FlumeData.1557295160087
-rw-r--r--   3 root supergroup    408 2019-05-08 13:59 /user/root/flume/hive-logs/FlumeData.1557295160088
-rw-r--r--   3 root supergroup   1319 2019-05-08 13:59 /user/root/flume/hive-logs/FlumeData.1557295160089
-rw-r--r--   3 root supergroup    240 2019-05-08 13:59 /user/root/flume/hive-logs/FlumeData.1557295160090
-rw-r--r--   3 root supergroup   1083 2019-05-08 13:59 /user/root/flume/hive-logs/FlumeData.1557295160091
-rw-r--r--   3 root supergroup    255 2019-05-08 13:59 /user/root/flume/hive-logs/FlumeData.1557295160092
-rw-r--r--   3 root supergroup    122 2019-05-08 13:59 /user/root/flume/hive-logs/FlumeData.1557295160093
-rw-r--r--   3 root supergroup    956 2019-05-08 13:59 /user/root/flume/hive-logs/FlumeData.1557295160094
-rw-r--r--   3 root supergroup    515 2019-05-08 13:59 /user/root/flume/hive-logs/FlumeData.1557295160095.tmp
```
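The exec source does nothing Flume-specific: it just runs the configured command and reads its stdout. You can sanity-check the tail behavior outside Flume with a stand-in log file (the /tmp path here is illustrative, not part of the original setup):

```shell
# Append two sample lines to a stand-in log file.
echo "13:43:01 INFO ql.Driver: Starting command" >> /tmp/hive-demo.log
echo "13:43:02 INFO ql.Driver: OK" >> /tmp/hive-demo.log

# tail -f would keep streaming each new line like these to Flume;
# tail -n 1 just prints the most recent one and exits.
tail -n 1 /tmp/hive-demo.log
```

One caveat worth knowing: `tail -f` follows the open file descriptor, so if the log is rotated the source keeps reading the old file. `tail -F` re-opens the file by name after rotation, which is usually safer for long-running ingestion.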
3. When HDFS runs in HA mode
When HDFS is deployed in HA mode, change the sink definition:

```
a2.sinks.k2.type = hdfs
a2.sinks.k2.hdfs.path = hdfs://<nameservice>:8020/user/root/flume/hive-logs/
```

That is, replace the hostname with the HA nameservice name. Then copy core-site.xml and hdfs-site.xml into Flume's conf directory so Flume can read them and resolve the nameservice.
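For example, if the nameservice defined in hdfs-site.xml were named `ns1` (a hypothetical name; substitute your cluster's actual nameservice), the sink path would become:

```
a2.sinks.k2.hdfs.path = hdfs://ns1/user/root/flume/hive-logs/
```

Note that with a nameservice URI the port is normally omitted: the HDFS client resolves the active NameNode from the copied hdfs-site.xml rather than connecting to a fixed host and port.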