
Code Examples for Flume in Different Modes

Flume

Everything you need is in the official documentation.

Flume configuration

Flume code examples

Flume's main building block is the agent. An agent is composed of a Source (where data enters), a Channel (the data pipeline), and a Sink (where data is written out).

# example.conf: A single-node Flume configuration

# Name the components on this agent
# Name the agent and the three components inside it.
# sources, sinks and channels are plural, so each of them can list several
# components at once to fit different business scenarios.
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
# Configure the source according to where the data comes from.
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444

# Describe the sink
# Configure the sink according to where the data should be delivered.
a1.sinks.k1.type = logger

# Use a channel which buffers events in memory
# The channel holds up to 1000 events and moves at most 100 per transaction.
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel
# Map the source to its channel and the sink to its channel.
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
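
To try this out, start the agent with the flume-ng launcher and send it a few lines over the netcat port; the commands below assume the configuration above was saved as conf/example.conf inside the Flume installation directory:

$ bin/flume-ng agent --conf conf --conf-file conf/example.conf --name a1 -Dflume.root.logger=INFO,console

# In a second terminal, anything typed here shows up in the agent's log output
$ telnet localhost 44444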

Overview of the flow models

Different flow models correspond to different business requirements.

Flume-to-Flume connections use the Avro data type.
To flow the data across multiple agents or hops, the sink of the previous agent and the source of the current hop need to be of avro type, with the sink pointing to the hostname (or IP address) and port of the source.
Connecting multiple Flume agents

Code example: across two machines

# Read data from the web service on the first machine, forward it to the
# second machine, and have the second machine write it out to storage.

# Configuration file on the first machine
# Name the agent's components
a1.sources = r1
a1.sinks = k1
a1.channels = c1
# Describe/configure the source
a1.sources.r1.type = exec
a1.sources.r1.command = tail -f /home/bigdata/webapp.log
# Describe the sink
# The avro sink points at the avro source of the next hop
a1.sinks.k1.type = avro
a1.sinks.k1.hostname = bigdata
a1.sinks.k1.port = 8888
# Use a channel which buffers events in memory
a1.channels.c1.type = memory
# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1


# Configuration file on the second machine
# Name the components on this agent
a2.sources = r1
a2.sinks = k1
a2.channels = c1
# Describe/configure the source
a2.sources.r1.type = avro
a2.sources.r1.bind = bigdata
a2.sources.r1.port = 8888
# Describe the sink
a2.sinks.k1.type = hdfs
a2.sinks.k1.hdfs.path = hdfs://bigdata:9000/logDir
# Use a channel which buffers events in memory
a2.channels.c1.type = memory
# Bind the source and sink to the channel
a2.sources.r1.channels = c1
a2.sinks.k1.channel = c1
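
When running this pair, the second machine's agent (a2) should be started first so its avro source is already listening when a1's avro sink connects; the file names first.conf and second.conf below are placeholders for wherever the two configurations were saved:

# On the second machine (avro source -> HDFS), start this one first
$ bin/flume-ng agent --conf conf --conf-file conf/second.conf --name a2
# On the first machine (tails webapp.log and forwards over avro)
$ bin/flume-ng agent --conf conf --conf-file conf/first.conf --name a1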

Multiple Flume agents consolidating into one agent before output

A very common scenario in log collection is a large number of log producing clients sending data to a few consumer agents that are attached to the storage subsystem. For example, logs collected from hundreds of web servers sent to a dozen of agents that write to HDFS cluster.
This can be achieved in Flume by configuring a number of first tier agents with an avro sink, all pointing to an avro source of single agent (Again you could use the thrift sources/sinks/clients in such a scenario). This source on the second tier agent consolidates the received events into a single channel which is consumed by a sink to its final destination
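
A minimal sketch of this consolidation pattern, assuming a first-tier agent named web1 on each web server and a second-tier agent named collector; the hostname collector-host, the port 4545 and the HDFS path are placeholders:

# On each web server: an avro sink pointing at the collector's avro source
web1.sinks.k1.type = avro
web1.sinks.k1.hostname = collector-host
web1.sinks.k1.port = 4545

# On the collector: one avro source receives from all first-tier agents,
# a single channel buffers the merged stream, and one sink writes it to HDFS
collector.sources = r1
collector.channels = c1
collector.sinks = k1
collector.sources.r1.type = avro
collector.sources.r1.bind = 0.0.0.0
collector.sources.r1.port = 4545
collector.channels.c1.type = memory
collector.sinks.k1.type = hdfs
collector.sinks.k1.hdfs.path = hdfs://namenode:9000/consolidated-logs
collector.sources.r1.channels = c1
collector.sinks.k1.channel = c1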
Flume supports multiplexing the event flow to one or more destinations. This is achieved by defining a flow multiplexer that can replicate or selectively route an event to one or more channels.
(Figure in the official guide: a fan-out flow using a multiplexing channel selector.)
The fan-out example in the official guide shows a source from agent "foo" fanning out the flow to three different channels. This fan-out can be replicating or multiplexing. In the replicating case, each event is sent to all three channels. In the multiplexing case, an event is delivered to a subset of the available channels when one of its attributes matches a preconfigured value. For example, if an event attribute called "txnType" is set to "customer", the event goes to channel1 and channel3; if it is "vendor", it goes to channel2; otherwise it goes to channel3. The mapping can be set in the agent's configuration file.
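
A minimal sketch of that multiplexing selector, assuming an agent a1 with channels c1, c2 and c3 and events carrying a header named txnType (the mapping follows the fan-out example described above):

# Fan the source out to three channels and route by the txnType header
a1.sources = r1
a1.channels = c1 c2 c3
a1.sources.r1.channels = c1 c2 c3
a1.sources.r1.selector.type = multiplexing
a1.sources.r1.selector.header = txnType
# "customer" events go to c1 and c3, "vendor" events go to c2,
# and everything else falls back to c3
a1.sources.r1.selector.mapping.customer = c1 c3
a1.sources.r1.selector.mapping.vendor = c2
a1.sources.r1.selector.default = c3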

Multiple sinks in Flume writing to different destinations

The first machine collects the log data.

The second machine displays the first machine's log data in real time.

The third machine stores the log data on HDFS.

Flume reads the data from a file
# a1 on the first machine: tail the Hive log and forward it over Avro
# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
# The source type is chosen according to the kind of data being read
a1.sources.r1.type = exec
a1.sources.r1.command = tail -F /opt/app/hive-0.13.1-cdh5.3.6/logs/hive.log
a1.sources.r1.shell = /bin/bash -c

# Describe the sink
a1.sinks.k1.type = avro
a1.sinks.k1.hostname = hh4
a1.sinks.k1.port = 4141

# Describe the channel
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1

# Name the components on this agent
a2.sources = r1
a2.sinks = k1
a2.channels = c1

# Describe/configure the source
a2.sources.r1.type = netcat
a2.sources.r1.bind = hh4
a2.sources.r1.port = 44444

# Describe the sink
a2.sinks.k1.type = avro
a2.sinks.k1.hostname = hh4
a2.sinks.k1.port = 4141

# Use a channel which buffers events in memory
a2.channels.c1.type = memory
a2.channels.c1.capacity = 1000
a2.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel
a2.sources.r1.channels = c1
a2.sinks.k1.channel = c1

# Name the components on this agent
a3.sources = r1
a3.sinks = k1
a3.channels = c1
# Describe/configure the source
a3.sources.r1.type = avro
a3.sources.r1.bind = hh4
a3.sources.r1.port = 4141
# Describe the sink
a3.sinks.k1.type = hdfs
a3.sinks.k1.hdfs.path = hdfs://hh4:8020/flume3/%Y%m%d/%H

# Prefix for the files uploaded to HDFS
a3.sinks.k1.hdfs.filePrefix = flume3-
# Whether to round down the event timestamp (so directories roll by time)
a3.sinks.k1.hdfs.round = true
# How many time units to round down to (one new directory per unit)
a3.sinks.k1.hdfs.roundValue = 1
# The time unit used for rounding
a3.sinks.k1.hdfs.roundUnit = hour
# Use the local timestamp instead of one taken from the event headers
a3.sinks.k1.hdfs.useLocalTimeStamp = true
# How many events to accumulate before flushing to HDFS
a3.sinks.k1.hdfs.batchSize = 100
# File type; compressed formats are also supported
a3.sinks.k1.hdfs.fileType = DataStream
# Roll to a new file every 600 seconds
a3.sinks.k1.hdfs.rollInterval = 600
# Roll the file once it reaches roughly 128 MB
a3.sinks.k1.hdfs.rollSize = 134217700
# Do not roll based on the number of events
a3.sinks.k1.hdfs.rollCount = 0
# Minimum number of replicas per HDFS block
a3.sinks.k1.hdfs.minBlockReplicas = 1

# Describe the channel
a3.channels.c1.type = memory
a3.channels.c1.capacity = 1000
a3.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel
a3.sources.r1.channels = c1
a3.sinks.k1.channel = c1
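
As with the earlier chain, the downstream agent should be started before the agents that send to it, so a3 comes up first; the file names flume-a1.conf, flume-a2.conf and flume-a3.conf are placeholders for wherever the three configurations above were saved:

# On the third machine (avro source -> HDFS), start this one first
$ bin/flume-ng agent --conf conf --conf-file conf/flume-a3.conf --name a3
# On the second machine (netcat source -> avro sink)
$ bin/flume-ng agent --conf conf --conf-file conf/flume-a2.conf --name a2
# On the first machine (tails the Hive log -> avro sink)
$ bin/flume-ng agent --conf conf --conf-file conf/flume-a1.conf --name a1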