Distributed Log Collection System: Flume

阿新 • Published: 2019-02-13

Common distributed log collection systems

Flume overview

Flume architecture diagram

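Flume key points:

An event is one line of data: each line of a log becomes one event.
1. Flume is a distributed log collection system that delivers the data it collects to a destination.
2. The core concept in Flume is the agent. An agent is a Java process that runs on a log collection node.
3. An agent contains three core components: source, channel, and sink.
3.1 The source component collects logs. It can handle log data of many types and formats, including avro, thrift, exec, jms, spooling directory, netcat, sequence generator, syslog, http, legacy, and custom sources.
After the source collects data, it stages the data temporarily in the channel.
3.2 The channel component is the agent's temporary storage. Data can be held in memory, jdbc, file, or a custom channel.
Data in the channel is deleted only after the sink has sent it successfully.
3.3 The sink component sends data to the destination. Destinations include hdfs, logger, avro, thrift, irc, file, null, hbase, solr, and custom sinks.
4. What flows through the whole pipeline is the event. Transactional guarantees apply at the event level.
5. Flume supports multi-tier agents, as well as fan-in and fan-out.
Fan-in means that a source can receive input from multiple upstream senders.
Fan-out means that one flow can be delivered to multiple destinations (in Flume, a source writes to several channels, each drained by its own sink).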
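The rule that the channel keeps an event until the sink has sent it successfully can be illustrated with a small Python sketch. This is a toy model, not Flume's implementation: the `Channel` class and the `source` and `drain` functions are hypothetical names, and a failed delivery is simulated by raising `OSError`.

```python
from collections import deque


class Channel:
    """Hypothetical in-memory channel: holds events until a sink confirms delivery."""

    def __init__(self):
        self._queue = deque()

    def put(self, event):
        self._queue.append(event)

    def peek(self):
        return self._queue[0] if self._queue else None

    def commit(self):
        # Called only after the sink has delivered the event at the head.
        self._queue.popleft()

    def __len__(self):
        return len(self._queue)


def source(lines, channel):
    """Turn each input line into one event and stage it in the channel."""
    for line in lines:
        channel.put(line.rstrip("\n"))


def drain(channel, deliver):
    """Sink loop: deliver events in order and stop at the first failure.

    An event leaves the channel only after deliver() returns without raising.
    Returns the number of events delivered.
    """
    delivered = 0
    while len(channel) > 0:
        event = channel.peek()
        try:
            deliver(event)
        except OSError:
            break  # the event stays in the channel for a later retry
        channel.commit()
        delivered += 1
    return delivered


# Usage: a plain list stands in for the destination.
demo_channel = Channel()
source(["first line\n", "second line\n"], demo_channel)
destination = []
print(drain(demo_channel, destination.append), destination)
```

If `deliver` raises, the event is still at the head of the channel, so a later call to `drain` retries it instead of losing it.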

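Installing Flume:

1. Extract the two Flume packages, bin and src, on the node. This produces the directories apache-flume-1.4.0-bin and apache-flume-1.4.0-src.
2. Copy the contents of src into bin:

cp -ri apache-flume-1.4.0-src/* apache-flume-1.4.0-bin/

3. The src directory is no longer needed and can be deleted:

rm -rf apache-flume-1.4.0-src

4. Rename apache-flume-1.4.0-bin to flume:

mv apache-flume-1.4.0-bin/ flume

Note: Flume requires Hadoop to be installed first, because it uses Hadoop's jar files.

5. Write the configuration file example

agent1 is the name of the agent: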

agent1.sources=source1
agent1.sinks=sink1
agent1.channels=channel1

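The Spooling Directory source watches a specified directory for new files. As soon as a new file appears, it parses the file's contents and writes them to the channel. Once the write finishes, it marks the file as completed or deletes it.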
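The behaviour described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how Flume implements it: `process_spool_dir` is a hypothetical function, a plain list stands in for the channel, and a temporary directory stands in for /root/hmbbs.

```python
import os
import tempfile

COMPLETED_SUFFIX = ".COMPLETED"


def process_spool_dir(spool_dir, channel):
    """One pass over spool_dir: read each unprocessed file, then mark it done.

    Every line becomes one event appended to channel (a list here).
    Files that already carry the completed suffix are skipped.
    """
    for name in sorted(os.listdir(spool_dir)):
        path = os.path.join(spool_dir, name)
        if name.endswith(COMPLETED_SUFFIX) or not os.path.isfile(path):
            continue
        with open(path, encoding="utf-8") as f:
            for line in f:
                channel.append(line.rstrip("\n"))
        # Rename only after the whole file has been written to the channel.
        os.rename(path, path + COMPLETED_SUFFIX)
    return channel


# Usage: drop one file into a throwaway spool directory and process it.
spool = tempfile.mkdtemp()
with open(os.path.join(spool, "hello"), "w", encoding="utf-8") as f:
    f.write("hello flume\nsecond line\n")
events = process_spool_dir(spool, [])
print(events)
print(os.listdir(spool))
```

Because completed files are skipped, running the pass again does not send the same lines twice.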

Configure source1:

agent1.sources.source1.type=spooldir
agent1.sources.source1.spoolDir=/root/hmbbs
agent1.sources.source1.channels=channel1
agent1.sources.source1.fileHeader = false
agent1.sources.source1.interceptors = i1
agent1.sources.source1.interceptors.i1.type = timestamp

Configure sink1:

agent1.sinks.sink1.type=hdfs
agent1.sinks.sink1.hdfs.path=hdfs://hadoop0:9000/hmbbs
agent1.sinks.sink1.hdfs.fileType=DataStream
agent1.sinks.sink1.hdfs.writeFormat=TEXT
# close (roll) the current file after this many seconds
agent1.sinks.sink1.hdfs.rollInterval=1
agent1.sinks.sink1.channel=channel1
# prefix of the generated file names
agent1.sinks.sink1.hdfs.filePrefix=%Y-%m-%d
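The timestamp interceptor on source1 and the %Y-%m-%d file prefix on sink1 work together: the interceptor stamps each event with a time, and the sink uses that time to fill in the date pattern. A rough Python sketch of the idea follows. The function names `add_timestamp` and `resolve_prefix` are hypothetical, only %Y, %m and %d are handled, and UTC is used here so that the result is reproducible, whereas a real agent would normally apply its own time zone.

```python
from datetime import datetime, timezone


def add_timestamp(headers, now_ms):
    """Mimic the timestamp interceptor: store epoch milliseconds in the headers."""
    stamped = dict(headers)
    stamped["timestamp"] = str(now_ms)
    return stamped


def resolve_prefix(pattern, headers):
    """Expand %Y, %m and %d in pattern from the event's timestamp header."""
    ms = int(headers["timestamp"])
    when = datetime.fromtimestamp(ms / 1000, tz=timezone.utc)  # UTC is an assumption
    return (
        pattern.replace("%Y", f"{when.year:04d}")
        .replace("%m", f"{when.month:02d}")
        .replace("%d", f"{when.day:02d}")
    )


# Usage: an event stamped at noon UTC on 2019-02-13.
noon_ms = int(datetime(2019, 2, 13, 12, 0, tzinfo=timezone.utc).timestamp() * 1000)
event_headers = add_timestamp({}, noon_ms)
print(resolve_prefix("%Y-%m-%d", event_headers))  # prints 2019-02-13
```

Without a timestamp on the event there is nothing to expand the pattern from, which is why the interceptor is configured on the source.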

Configure channel1:

agent1.channels.channel1.type=file
# backup (checkpoint) directory
agent1.channels.channel1.checkpointDir=/root/hmbbs_tmp/123
agent1.channels.channel1.dataDirs=/root/hmbbs_tmp/

Save this file in Flume's conf directory under the name example.
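A common mistake in a file like this is a channel name that does not match between the declaration and the source or sink that uses it. Note that a source takes `channels` (plural) while a sink takes `channel` (singular). The following Python sketch checks that wiring before the agent is started. It is a hypothetical helper, not part of Flume, and its parser only understands simple key=value lines and # comments.

```python
def parse_properties(text):
    """Parse simple key=value lines; skip blank lines and # comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def check_wiring(props, agent):
    """Return a list of problems where a source or sink names an undeclared channel."""
    declared = set(props.get(f"{agent}.channels", "").split())
    problems = []
    for src in props.get(f"{agent}.sources", "").split():
        key = f"{agent}.sources.{src}.channels"
        names = props.get(key, "").split()
        if not names:
            problems.append(f"{key}: no channel set")
        for name in names:
            if name not in declared:
                problems.append(f"{key}: unknown channel {name}")
    for sink in props.get(f"{agent}.sinks", "").split():
        key = f"{agent}.sinks.{sink}.channel"
        name = props.get(key, "")
        if name not in declared:
            problems.append(f"{key}: unknown channel {name!r}")
    return problems


# Usage: the wiring lines from the example file above.
EXAMPLE = """\
# wiring lines taken from the example config
agent1.sources=source1
agent1.sinks=sink1
agent1.channels=channel1
agent1.sources.source1.channels=channel1
agent1.sinks.sink1.channel=channel1
"""
print(check_wiring(parse_properties(EXAMPLE), "agent1"))
```

An empty list means every source and sink points at a declared channel.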

6. Create the directory hmbbs under /root

[root /]# cd /root
[root ~]# ls
anaconda-ks.cfg Documents install.log Music Public Videos
Desktop Downloads install.log.syslog Pictures Templates
[root ~]# mkdir hmbbs

7. Create the directory in HDFS

hadoop fs -mkdir /hmbbs

8. Run Flume
From the flume directory, run:

bin/flume-ng agent -n agent1 -c conf -f conf/example -Dflume.root.logger=DEBUG,console

9. Create a file named hello and copy it into hmbbs

[root ~]# vi hello
[root ~]# cp hello hmbbs

The file's contents then show up in HDFS.

10. Check the spooling directory and the channel directory

[root ~]# cd hmbbs
[root hmbbs]# ls
hello.COMPLETED

The .COMPLETED suffix shows that the file has been processed and its contents written to the channel. The file is renamed with this suffix once processing finishes.

[root ~]# cd hmbbs_tmp
[root hmbbs_tmp]# ls

hmbbs_tmp is the directory used by the channel.

[root hmbbs_tmp]# cd 123
[root 123]# ls
checkpoint checkpoint.meta inflightputs inflighttakes

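These are the file channel's checkpoint files. They record the state of the channel's queue so that the channel can be restored after a restart; the event data itself is stored in the directory given by dataDirs.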
