[Logstash-input-redis] Usage Explained
Complete configuration of the redis plugin
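input {
  redis {
    batch_count => 1               # number of events returned per fetch; only takes effect in list mode
    data_type => "list"            # how the logstash redis plugin operates
    key => "logstash-test-list"    # the key to listen on
    host => "127.0.0.1"            # redis address
    port => 6379                   # redis port
    password => "123qwe"           # password, if authentication is enabled
    db => 0                        # redis database number
    threads => 1                   # number of threads to start
  }
}
output {
  stdout{}
}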
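Workflow
The original diagram is rough, but the flow is roughly this:
- logstash starts the redis plugin
- the redis plugin reads its parameters and validates them
- it determines the listening mode (list, channel, or pattern_channel) and sets up the listening task for that mode
- it creates a redis instance and registers the EVAL script, then sends requests in the chosen redis mode and listens for data
- redis returns the requested data (from a list, or from a specific channel)
- the data is processed and handed back to logstash
- if a stop signal is sent, the plugin sends the command appropriate to the mode and disconnects from redis.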
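Source code walkthrough
First comes the plugin's own definition, which declares the parameters the redis plugin needs, their defaults, and their validation.
Then register prepares the information the Redis instance needs, such as the key name and the URL. Note that data_type has no default: it is set to list only when the deprecated queue option is used; otherwise both key and data_type must be specified.
The main entry point is run: depending on data_type, it picks the matching listener method and calls listener_loop to listen in a loop.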
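The dispatch in run can be sketched in plain Ruby without a Redis connection. This is a simplified stand-in, not the plugin's code: the class MiniInput, its LISTENERS table, and the listener bodies are hypothetical, and only the mapping from data_type to listener name mirrors the plugin.

```ruby
# Hypothetical, simplified stand-in for the plugin's run method.
class MiniInput
  # Same data_type -> listener mapping as the plugin's if/elsif/else chain.
  LISTENERS = {
    'list'            => :list_listener,
    'channel'         => :channel_listener,
    'pattern_channel' => :pattern_channel_listener
  }.freeze

  def initialize(data_type)
    @data_type = data_type
  end

  # Pick the listener method name for the configured data_type.
  def listener_name
    LISTENERS.fetch(@data_type) do
      raise ArgumentError, "unknown data_type: #{@data_type}"
    end
  end

  # Invoke the chosen listener once via send, the way listener_loop does.
  def run_once(output_queue)
    send(listener_name, output_queue)
  end

  private

  # Placeholder listeners: each just records which one was called.
  def list_listener(queue)
    queue << 'from list'
  end

  def channel_listener(queue)
    queue << 'from channel'
  end

  def pattern_channel_listener(queue)
    queue << 'from pattern_channel'
  end
end

queue = []
MiniInput.new('channel').run_once(queue)
puts queue.inspect
```

Unlike the plugin, which falls through to pattern_channel in its else branch, this sketch raises on an unknown data_type; that is a choice made here for clarity.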
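listener_loop takes two arguments: the name of the listener method and the output queue that receives the events. The loop keeps running until the shutdown flag is set; the one-second pause (sleep 1) only happens after a connection error, before reconnecting.
As the loop shows, the instance variable @shutdown_requested decides whether to continue. The teardown method sets it to true and then exits in a mode-specific way.
In list mode it sends quit; in channel mode it sends the redis unsubscribe command and disconnects; in pattern_channel mode it sends punsubscribe and disconnects.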
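The flag-controlled loop can be illustrated with a minimal sketch. LoopDemo is a hypothetical class, not part of the plugin; it keeps only the two pieces discussed here, a loop that checks a flag and a teardown method that sets it.

```ruby
# Hypothetical illustration of a flag-controlled listener loop.
class LoopDemo
  attr_reader :iterations

  def initialize
    @shutdown_requested = false
    @iterations = 0
  end

  # Keep yielding to the listener block until teardown flips the flag.
  def listener_loop
    until @shutdown_requested
      @iterations += 1
      yield self
    end
  end

  # Request shutdown; the loop exits before starting its next iteration.
  def teardown
    @shutdown_requested = true
  end
end

demo = LoopDemo.new
# Simulate a stop signal arriving during the third iteration.
demo.listener_loop { |d| d.teardown if d.iterations == 3 }
puts demo.iterations
```

In the real plugin the listener may be blocked inside a Redis call when teardown runs, which is why teardown also sends quit, unsubscribe, or punsubscribe instead of relying on the flag alone.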
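Inside the loop, the plugin checks whether a redis instance already exists; if not, it calls connect to create one, otherwise it uses the existing instance.
connect first calls Redis.new to initialize a redis instance. It then checks batch_count: if it is 1 (or data_type is not list), nothing more is done and the instance is returned.
If batch_count is greater than 1, it calls load_batch_script, which loads a Lua script into redis's script dictionary for later use. The script is the heredoc inside load_batch_script in the complete code at the end.
That script is probably the hardest part of the plugin to understand. To make sense of it, you need to know a few things:
- the basic concepts of Lua scripts
- how to use the EVAL command in Redis
- what the script itself does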
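First, running the script requires Redis 2.6 or later, which is when EVAL became available; older versions report an error. EVAL is similar to eval in JavaScript: it interprets a string as code, and here that string is a Lua script. What is the benefit?
Simply put, several operations can be performed in one go. A script can contain a sequence of operations, which the server executes atomically in a single call before returning the result.
Lua scripts
There is no need to study Lua in depth, but you do need to know local and table. local declares a local variable, so the script does not pollute the global Lua environment inside redis. table is Lua's data structure, somewhat like a JSON array or object, and is used to hold data.
The EVAL command
You also need to know how EVAL is called. The following command makes it clear: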
EVAL "return {KEYS[1],KEYS[2],ARGV[1],ARGV[2]}" 2 name age xing 13
It returns:
name
age
xing
13
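This example is for illustration only. EVAL is followed by a script, then the number of keys, then the key names and any additional arguments; the script reads them through the KEYS and ARGV arrays, whose indices start at 1.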
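How the trailing arguments end up in KEYS and ARGV can be mimicked in a few lines of Ruby. The helper split_eval_args is hypothetical and exists only for this illustration; it assumes the key count comes first and that the first numkeys arguments are the keys.

```ruby
# Hypothetical helper mimicking how EVAL splits its trailing arguments.
def split_eval_args(numkeys, *args)
  raise ArgumentError, 'numkeys exceeds argument count' if numkeys > args.length

  keys = args.first(numkeys) # becomes KEYS in the Lua script
  argv = args.drop(numkeys)  # becomes ARGV in the Lua script
  [keys, argv]
end

keys, argv = split_eval_args(2, 'name', 'age', 'xing', '13')
# Lua indices start at 1, so KEYS[1] corresponds to keys[0] in Ruby.
puts keys.inspect
puts argv.inspect
```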
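As for the EVAL command itself, it runs as follows:
- it parses the script string and defines a Lua function named after the script's SHA1 checksum
- it stores the checksum and the function in a lua_script dictionary, after which the EVALSHA command can run the function directly by its checksum.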
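The caching behavior can be modeled in memory, assuming the checksum is the SHA1 hex digest of the script body (which is what SCRIPT LOAD returns). ScriptCache is a hypothetical class; it models only the store-and-lookup step and does not execute Lua.

```ruby
require 'digest/sha1'

# Hypothetical in-memory model of the server-side script dictionary.
class ScriptCache
  def initialize
    @scripts = {}
  end

  # Like SCRIPT LOAD: store the script under its SHA1 hex digest and return it.
  def load(script)
    sha = Digest::SHA1.hexdigest(script)
    @scripts[sha] = script
    sha
  end

  # Like the EVALSHA lookup: fail with a NOSCRIPT-style error for unknown digests.
  def fetch(sha)
    @scripts.fetch(sha) { raise KeyError, 'NOSCRIPT No matching script' }
  end
end

cache = ScriptCache.new
sha = cache.load("return redis.call('llen', KEYS[1])")
puts sha.length        # a SHA1 hex digest is 40 characters long
puts cache.fetch(sha)
```

The failing lookup is the case the plugin handles in list_listener: if redis was restarted and lost its scripts, the NOSCRIPT error triggers load_batch_script again followed by a retry.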
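With this background, we can look at what the Lua script in load_batch_script does.
It first reads the argument and assigns it to i, then creates a table res. Next it calls llen to get the length of the given list. If the length is at least i, nothing changes; if it is less than i, i is set to length. It then calls lpop i times, appending each element to res, and finally returns res.
Put simply, it compares the number of elements in the list with the requested count (the plugin passes batch_count - 1, because BLPOP has already fetched one event). If the count is 5 and the list holds 5 or more items, it takes 5 and returns them in one go; otherwise it takes length items.
So the purpose of the script is to let logstash obtain up to batch_count events per round of requests, which reduces the load of handling requests on the server.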
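The same logic can be written as a small Ruby port, with an Array standing in for the Redis list. The function batch_pop is hypothetical and runs locally; it only shows the control flow of the Lua script, not its atomicity on the server.

```ruby
# Ruby port of the Lua batch script; an Array stands in for the Redis list.
def batch_pop(list, count)
  i = [count, list.length].min # if length < i then i = length end
  res = []
  while i > 0
    item = list.shift          # stand-in for LPOP
    break if item.nil?         # list emptied early
    res << item
    i -= 1
  end
  res
end

short_list = %w[a b c]
puts batch_pop(short_list, 5).inspect # fewer items than requested: whole list

long_list = %w[a b c d e f]
puts batch_pop(long_list, 5).inspect  # enough items: exactly 5 are taken
puts long_list.inspect                # the remainder stays in the list
```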
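With the script covered, we can look at how each working mode is implemented:
First, list mode: it runs the BLPOP command to fetch one item. It then checks batch_count: if it is 1, the method returns; if it is greater than 1, it uses the evalsha command to invoke the previously stored script.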
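One pass of the list listener can be approximated as follows. The function fetch_events is hypothetical, and the Array operations are non-blocking stand-ins for BLPOP and EVALSHA, so this sketch shows only how the first item and the rest of the batch are combined.

```ruby
# Hypothetical stand-in for one pass of list_listener; an Array plays the list.
def fetch_events(list, batch_count)
  first = list.shift            # stand-in for BLPOP (non-blocking here)
  return [] if first.nil?       # like a BLPOP timeout: nothing to read

  events = [first]
  return events if batch_count == 1

  # Stand-in for EVALSHA with ARGV[1] = batch_count - 1.
  events.concat(list.shift(batch_count - 1))
end

list = (1..10).to_a
puts fetch_events(list, 4).inspect
puts list.length
```

Fetching the first item separately keeps the blocking wait in BLPOP, while the script call only drains whatever else is already queued.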
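channel and pattern_channel need little explanation: they simply call the subscribe and psubscribe commands respectively.
The hardest part is really the Lua script in the middle. Once its purpose is clear, the redis plugin is not hard to understand.
Complete code: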
# encoding: utf-8
require "logstash/inputs/base"
require "logstash/inputs/threadable"
require "logstash/namespace"

# This input will read events from a Redis instance; it supports both Redis channels and lists.
# The list command (BLPOP) used by Logstash is supported in Redis v1.3.1+, and
# the channel commands used by Logstash are found in Redis v1.3.8+.
# While you may be able to make these Redis versions work, the best performance
# and stability will be found in more recent stable versions. Versions 2.6.0+
# are recommended.
#
# For more information about Redis, see <http://redis.io/>
#
# `batch_count` note: If you use the `batch_count` setting, you *must* use a Redis version 2.6.0 or
# newer. Anything older does not support the operations used by batching.
#
class LogStash::Inputs::Redis < LogStash::Inputs::Threadable
  config_name "redis"

  default :codec, "json"

  # The `name` configuration is used for logging in case there are multiple instances.
  # This feature has no real function and will be removed in future versions.
  config :name, :validate => :string, :default => "default", :deprecated => true

  # The hostname of your Redis server.
  config :host, :validate => :string, :default => "127.0.0.1"

  # The port to connect on.
  config :port, :validate => :number, :default => 6379

  # The Redis database number.
  config :db, :validate => :number, :default => 0

  # Initial connection timeout in seconds.
  config :timeout, :validate => :number, :default => 5

  # Password to authenticate with. There is no authentication by default.
  config :password, :validate => :password

  # The name of the Redis queue (we'll use BLPOP against this).
  # TODO: remove soon.
  config :queue, :validate => :string, :deprecated => true

  # The name of a Redis list or channel.
  # TODO: change required to true
  config :key, :validate => :string, :required => false

  # Specify either list or channel. If `redis\_type` is `list`, then we will BLPOP the
  # key. If `redis\_type` is `channel`, then we will SUBSCRIBE to the key.
  # If `redis\_type` is `pattern_channel`, then we will PSUBSCRIBE to the key.
  # TODO: change required to true
  config :data_type, :validate => [ "list", "channel", "pattern_channel" ], :required => false

  # The number of events to return from Redis using EVAL.
  config :batch_count, :validate => :number, :default => 1

  public
  def register
    require 'redis'
    @redis = nil
    @redis_url = "redis://#{@password}@#{@host}:#{@port}/#{@db}"

    # TODO remove after setting key and data_type to true
    if @queue
      if @key or @data_type
        raise RuntimeError.new(
          "Cannot specify queue parameter and key or data_type"
        )
      end
      @key = @queue
      @data_type = 'list'
    end

    if not @key or not @data_type
      raise RuntimeError.new(
        "Must define queue, or key and data_type parameters"
      )
    end
    # end TODO

    @logger.info("Registering Redis", :identity => identity)
  end # def register

  # A string used to identify a Redis instance in log messages
  # TODO(sissel): Use instance variables for this once the @name config
  # option is removed.
  private
  def identity
    @name || "#{@redis_url} #{@data_type}:#{@key}"
  end

  private
  def connect
    redis = Redis.new(
      :host => @host,
      :port => @port,
      :timeout => @timeout,
      :db => @db,
      :password => @password.nil? ? nil : @password.value
    )
    load_batch_script(redis) if @data_type == 'list' && (@batch_count > 1)
    return redis
  end # def connect

  private
  def load_batch_script(redis)
    #A Redis Lua EVAL script to fetch a count of keys
    #in case count is bigger than current items in queue whole queue will be returned without extra nil values
    redis_script = <<EOF
          local i = tonumber(ARGV[1])
          local res = {}
          local length = redis.call('llen',KEYS[1])
          if length < i then i = length end
          while (i > 0) do
            local item = redis.call("lpop", KEYS[1])
            if (not item) then
              break
            end
            table.insert(res, item)
            i = i-1
          end
          return res
EOF
    @redis_script_sha = redis.script(:load, redis_script)
  end

  private
  def queue_event(msg, output_queue)
    begin
      @codec.decode(msg) do |event|
        decorate(event)
        output_queue << event
      end
    rescue LogStash::ShutdownSignal => e
      # propagate up
      raise(e)
    rescue => e # parse or event creation error
      @logger.error("Failed to create event", :message => msg, :exception => e, :backtrace => e.backtrace);
    end
  end

  private
  def list_listener(redis, output_queue)
    item = redis.blpop(@key, 0, :timeout => 1)
    return unless item # from timeout or other conditions

    # blpop returns the 'key' read from as well as the item result
    # we only care about the result (2nd item in the list).
    queue_event(item[1], output_queue)

    # If @batch_count is 1, there's no need to continue.
    return if @batch_count == 1

    begin
      redis.evalsha(@redis_script_sha, [@key], [@batch_count-1]).each do |item|
        queue_event(item, output_queue)
      end

      # Below is a commented-out implementation of 'batch fetch'
      # using pipelined LPOP calls. This in practice has been observed to
      # perform exactly the same in terms of event throughput as
      # the evalsha method. Given that the EVALSHA implementation uses
      # one call to Redis instead of N (where N == @batch_count) calls,
      # I decided to go with the 'evalsha' method of fetching N items
      # from Redis in bulk.
      #redis.pipelined do
        #error, item = redis.lpop(@key)
        #(@batch_count-1).times { redis.lpop(@key) }
      #end.each do |item|
        #queue_event(item, output_queue) if item
      #end
      # --- End commented out implementation of 'batch fetch'
    rescue Redis::CommandError => e
      if e.to_s =~ /NOSCRIPT/ then
        @logger.warn("Redis may have been restarted, reloading Redis batch EVAL script", :exception => e);
        load_batch_script(redis)
        retry
      else
        raise e
      end
    end
  end

  private
  def channel_listener(redis, output_queue)
    redis.subscribe @key do |on|
      on.subscribe do |channel, count|
        @logger.info("Subscribed", :channel => channel, :count => count)
      end

      on.message do |channel, message|
        queue_event message, output_queue
      end

      on.unsubscribe do |channel, count|
        @logger.info("Unsubscribed", :channel => channel, :count => count)
      end
    end
  end

  private
  def pattern_channel_listener(redis, output_queue)
    redis.psubscribe @key do |on|
      on.psubscribe do |channel, count|
        @logger.info("Subscribed", :channel => channel, :count => count)
      end

      on.pmessage do |ch, event, message|
        queue_event message, output_queue
      end

      on.punsubscribe do |channel, count|
        @logger.info("Unsubscribed", :channel => channel, :count => count)
      end
    end
  end

  # Since both listeners have the same basic loop, we've abstracted the outer
  # loop.
  private
  def listener_loop(listener, output_queue)
    while !@shutdown_requested
      begin
        @redis ||= connect
        self.send listener, @redis, output_queue
      rescue Redis::BaseError => e
        @logger.warn("Redis connection problem", :exception => e)
        # Reset the redis variable to trigger reconnect
        @redis = nil
        sleep 1
      end
    end
  end # listener_loop

  public
  def run(output_queue)
    if @data_type == 'list'
      listener_loop :list_listener, output_queue
    elsif @data_type == 'channel'
      listener_loop :channel_listener, output_queue
    else
      listener_loop :pattern_channel_listener, output_queue
    end
  rescue LogStash::ShutdownSignal
    # ignore and quit
  end # def run

  public
  def teardown
    @shutdown_requested = true

    if @redis
      if @data_type == 'list'
        @redis.quit rescue nil
      elsif @data_type == 'channel'
        @redis.unsubscribe rescue nil
        @redis.connection.disconnect
      elsif @data_type == 'pattern_channel'
        @redis.punsubscribe rescue nil
        @redis.connection.disconnect
      end
      @redis = nil
    end
  end
end # class LogStash::Inputs::Redis