Loading MySQL Table Data into HBase with Flume

The three serializers available for the HBase sinks

  • SimpleHbaseEventSerializer
  • RegexHbaseEventSerializer
  • SimpleAsyncHbaseEventSerializer (used with AsyncHBaseSink)

This example uses SimpleHbaseEventSerializer.

1. Create table1 in HBase

hbase(main):021:0> create 'default:table1', 'info'
Created table default:table1
Took 1.3042 seconds
=> Hbase::Table - table1

2. Flume configuration file

# Name the components of this agent
agent.channels = ch1
agent.sinks = hbase-sink
agent.sources = sql-source

# Buffer events in memory between the source and the sink
agent.channels.ch1.type = memory

# SQL source, bound to channel ch1
agent.sources.sql-source.channels = ch1
agent.sources.sql-source.type = org.keedio.flume.source.SQLSource

# MySQL connection
agent.sources.sql-source.hibernate.connection.url = jdbc:mysql://192.168.1.69:3306/t_hadoop
agent.sources.sql-source.hibernate.connection.user = root
agent.sources.sql-source.hibernate.connection.password = root

# Table and columns to read
agent.sources.sql-source.table = t_name
agent.sources.sql-source.columns.to.select = *

# Incremental import: track the id column, starting from 0
agent.sources.sql-source.incremental.column.name = id
agent.sources.sql-source.incremental.value = 0

# Interval between queries, in milliseconds
agent.sources.sql-source.run.query.delay = 5000

# File in which the source records its progress
agent.sources.sql-source.status.file.path = /home/lwenhao/flume
agent.sources.sql-source.status.file.name = sql-source.status


# Sink: HBaseSink with SimpleHbaseEventSerializer
agent.sinks.hbase-sink.type = org.apache.flume.sink.hbase.HBaseSink
# HBase table name
agent.sinks.hbase-sink.table = table1
# HBase column family
agent.sinks.hbase-sink.columnFamily = info
agent.sinks.hbase-sink.serializer = org.apache.flume.sink.hbase.SimpleHbaseEventSerializer
# Column qualifier under the column family. SimpleHbaseEventSerializer writes the
# whole event body to a single column, so this comma-separated value is used as one
# qualifier name rather than as eight separate columns.
agent.sinks.hbase-sink.serializer.payloadColumn = id,sip,dip,sport,dport,protocol,flowvalue,createtime
# Bind the sink to the channel
agent.sinks.hbase-sink.channel = ch1

3. Start Flume

 bin/flume-ng agent --conf conf/ --name agent --conf-file conf/flume-hbase.conf -Dflume.root.logger=DEBUG,console

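4. Result

The data is written to HBase, but the mapping of field values to columns is not configured correctly yet; that part is still being revised. Using org.apache.flume.sink.hbase.RegexHbaseEventSerializer with a regular expression solves the problem.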
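
The RegexHbaseEventSerializer approach mentioned above could look roughly like the following sink fragment. It is a sketch under one assumption that the original does not state: that the SQL source emits each row as a single line of double-quoted, comma-separated values (for example "1","10.0.0.1",...). The regex and colNames properties belong to RegexHbaseEventSerializer; the regular expression itself is only illustrative and has to be adapted to the actual event body.

```properties
# Replace the two serializer lines of the sink with the regex-based serializer
agent.sinks.hbase-sink.serializer = org.apache.flume.sink.hbase.RegexHbaseEventSerializer
# One capture group per field; assumes a body such as "1","10.0.0.1",... (assumption)
agent.sinks.hbase-sink.serializer.regex = ^"([^"]*)","([^"]*)","([^"]*)","([^"]*)","([^"]*)","([^"]*)","([^"]*)","([^"]*)"$
# Column qualifiers, assigned to the capture groups in order
agent.sinks.hbase-sink.serializer.colNames = id,sip,dip,sport,dport,protocol,flowvalue,createtime
```

The number of capture groups has to equal the number of names in colNames; as far as I know, the serializer otherwise produces no writes for that event, so it is worth checking a sample event body against the expression before starting the agent.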