
Logstash: Introduction and Basic Operations

1: Logstash overview

Put simply, Logstash is a pipeline with real-time data transport capability: it moves data from the pipeline's input end to its output end. Along the way, you can insert filters to suit your needs, and Logstash provides many powerful filters to cover all kinds of use cases.

Logstash is commonly used as the log-collection component of a logging system; most often it serves as the log collector in an ELK stack.


2: What Logstash does

Logstash is an open-source, server-side data processing pipeline that centralizes, transforms, and stores your data. It can ingest data from multiple sources simultaneously, transform it, and then ship it to your favorite "stash".
3: Logstash architecture

The basic pipeline is: input | filter | output. If the data needs no additional processing, the filter stage can be omitted.
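In configuration terms, the three stages map directly onto three top-level sections of a config file. A minimal sketch, with the optional filter stage omitted:

```
input {
  stdin { }       # read events from standard input
}

output {
  stdout { }      # write events to standard output
}
```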


3.1 Input: ingests data of all shapes, sizes, and sources, collecting it from your servers.

Data often exists in many forms, scattered or concentrated across many systems. Logstash supports a wide variety of inputs and can capture events from numerous common sources at the same time, easily ingesting a continuous stream of data from your logs, metrics, web applications, data stores, and various AWS services.
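As a sketch, a few input plugins matching the sources above (the paths and ports here are illustrative assumptions, not values from this article's setup):

```
input {
  file {
    path => ["/var/log/messages"]   # tail a log file
  }
  beats {
    port => 5044                    # receive events from Beats shippers such as Filebeat
  }
  syslog {
    port => 514                     # listen for syslog messages
  }
}
```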


3.2 Filter

Filters apply processing to an event before it is emitted through an output. The key filter is grok.

grok: parses unstructured text into structured data. It is currently the go-to choice in Logstash for turning unstructured log data into structured, queryable data.

[root@node1 ~]# rpm -ql logstash | grep "patterns$"    # where the predefined grok patterns live
/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/logstash-patterns-core-4.1.2/patterns/grok-patterns
/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/logstash-patterns-core-4.1.2/patterns/mcollective-patterns
[root@node1 ~]#
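Each line in these pattern files maps a pattern name to a regular expression, and patterns can reference one another. A few entries from grok-patterns look roughly like this (excerpted; the exact expressions vary by version):

```
WORD \b\w+\b
INT (?:[+-]?(?:[0-9]+))
IP (?:%{IPV6}|%{IPV4})
```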

3.3 Output: ships the filtered data to databases and other storage backends.
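The most common destination is Elasticsearch. A minimal sketch of such an output (the host address and index name are placeholder assumptions):

```
output {
  elasticsearch {
    hosts => ["http://localhost:9200"]   # assumed Elasticsearch address
    index => "logstash-%{+YYYY.MM.dd}"   # one index per day
  }
}
```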


3.4 Summary

inputs: required; inputs generate events. Common inputs: file, syslog, redis, beats (e.g. Filebeat).
filters: optional; filters modify events. Common filters: grok, mutate, drop, clone, geoip.
outputs: required; outputs ship events elsewhere. Common outputs: elasticsearch, file, graphite, statsd.



4: Installing Logstash, from the rpm package.
Four nodes are prepared for the experiments:

IP               Node
192.168.126.128  node1
192.168.126.129  node2
192.168.126.130  node3
192.168.126.131  node4
[root@node1 ~]# rpm -ivh logstash-7.9.1.rpm
warning: logstash-7.9.1.rpm: Header V4 RSA/SHA512 Signature, key ID d88e42b4: NOKEY
Preparing... ################################# [100%]
Updating / installing...
1:logstash-1:7.9.1-1 ################################# [100%]
Using provided startup.options file: /etc/logstash/startup.options
/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/pleaserun-0.0.31/lib/pleaserun/platform/base.rb:112: warning: constant ::Fixnum is deprecated
Successfully created system startup script for Logstash
[root@node1 ~]# vim /etc/profile.d/logstash.sh    # add the Logstash executables to PATH
export PATH=$PATH:/usr/share/logstash/bin
[root@node1 ~]# source /etc/profile.d/logstash.sh

[root@node1 ~]# java -version    # Logstash is written in JRuby, so it needs a Java runtime
openjdk version "1.8.0_262"
OpenJDK Runtime Environment (build 1.8.0_262-b10)
OpenJDK 64-Bit Server VM (build 25.262-b10, mixed mode)

5: Logstash workflow

input {    # where to read input data from

}

filter {    # parse the data into structured form, e.g. with grok

}

output {    # where to ship the processed data

}

Example 1: read from standard input and write to standard output.

[root@node1 ~]# cd /etc/logstash/conf.d/    # default directory for Logstash pipeline configs
[root@node1 conf.d]# ls
[root@node1 conf.d]# vim shil.conf
input {
  stdin {    # standard input
  }
}

output {
  stdout {    # standard output
    codec => rubydebug    # pretty-print events in Ruby debug format
  }
}

[root@node1 conf.d]# logstash -f /etc/logstash/conf.d/shil.conf --config.debug    # --config.debug checks the configuration for errors
WARNING: Could not find logstash.yml which is typically located in $LS_HOME/config or /etc/logstash. You can specify the path using --path.settings. Continuing using the defaults
Could not find log4j2 configuration at path /usr/share/logstash/config/log4j2.properties. Using default config which logs errors to the console
[WARN ] 2020-10-13 13:11:00.251 [main] runner - --config.debug was specified, but log.level was not set to 'debug'! No config info will be logged.
[INFO ] 2020-10-13 13:11:00.261 [main] runner - Starting Logstash {"logstash.version"=>"7.9.1", "jruby.version"=>"jruby 9.2.13.0 (2.5.7) 2020-08-03 9a89c94bcc OpenJDK 64-Bit Server VM 25.262-b10 on 1.8.0_262-b10 +indy +jit [linux-x86_64]"}
[INFO ] 2020-10-13 13:11:00.319 [main] writabledirectory - Creating directory {:setting=>"path.queue", :path=>"/usr/share/logstash/data/queue"}
[INFO ] 2020-10-13 13:11:00.340 [main] writabledirectory - Creating directory {:setting=>"path.dead_letter_queue", :path=>"/usr/share/logstash/data/dead_letter_queue"}
[WARN ] 2020-10-13 13:11:00.803 [LogStash::Runner] multilocal - Ignoring the 'pipelines.yml' file because modules or command line options are specified
[INFO ] 2020-10-13 13:11:00.845 [LogStash::Runner] agent - No persistent UUID file found. Generating new UUID {:uuid=>"593d27c7-7f01-4bbc-a68c-e60c555d2f73", :path=>"/usr/share/logstash/data/uuid"}
[INFO ] 2020-10-13 13:11:02.670 [Converge PipelineAction::Create<main>] Reflections - Reflections took 44 ms to scan 1 urls, producing 22 keys and 45 values
[INFO ] 2020-10-13 13:11:03.627 [[main]-pipeline-manager] javapipeline - Starting pipeline {:pipeline_id=>"main", "pipeline.workers"=>2, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>50, "pipeline.max_inflight"=>250, "pipeline.sources"=>["/etc/logstash/conf.d/shil.conf"], :thread=>"#<Thread:0x758c4715 run>"}
[INFO ] 2020-10-13 13:11:04.503 [[main]-pipeline-manager] javapipeline - Pipeline Java execution initialization time {"seconds"=>0.87}
[INFO ] 2020-10-13 13:11:04.567 [[main]-pipeline-manager] javapipeline - Pipeline started {"pipeline.id"=>"main"}
The stdin plugin is now waiting for input:
[INFO ] 2020-10-13 13:11:04.682 [Agent thread] agent - Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}
[INFO ] 2020-10-13 13:11:04.935 [Api Webserver] agent - Successfully started Logstash API endpoint {:port=>9600}

^C[WARN ] 2020-10-13 13:16:23.243 [SIGINT handler] runner - SIGINT received. Shutting down.
[INFO ] 2020-10-13 13:16:24.389 [Converge PipelineAction::Stop<main>] javapipeline - Pipeline terminated {"pipeline.id"=>"main"}
^C[FATAL] 2020-10-13 13:16:24.429 [SIGINT handler] runner - SIGINT received. Terminating immediately..
[ERROR] 2020-10-13 13:16:24.490 [LogStash::Runner] Logstash - org.jruby.exceptions.ThreadKill
[root@node1 conf.d]# logstash -f /etc/logstash/conf.d/shil.conf
WARNING: Could not find logstash.yml which is typically located in $LS_HOME/config or /etc/logstash. You can specify the path using --path.settings. Continuing using the defaults
Could not find log4j2 configuration at path /usr/share/logstash/config/log4j2.properties. Using default config which logs errors to the console
[INFO ] 2020-10-13 13:16:44.536 [main] runner - Starting Logstash {"logstash.version"=>"7.9.1", "jruby.version"=>"jruby 9.2.13.0 (2.5.7) 2020-08-03 9a89c94bcc OpenJDK 64-Bit Server VM 25.262-b10 on 1.8.0_262-b10 +indy +jit [linux-x86_64]"}
[WARN ] 2020-10-13 13:16:44.920 [LogStash::Runner] multilocal - Ignoring the 'pipelines.yml' file because modules or command line options are specified
[INFO ] 2020-10-13 13:16:46.360 [Converge PipelineAction::Create<main>] Reflections - Reflections took 37 ms to scan 1 urls, producing 22 keys and 45 values
[INFO ] 2020-10-13 13:16:47.108 [[main]-pipeline-manager] javapipeline - Starting pipeline {:pipeline_id=>"main", "pipeline.workers"=>2, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>50, "pipeline.max_inflight"=>250, "pipeline.sources"=>["/etc/logstash/conf.d/shil.conf"], :thread=>"#<Thread:0x4df75091 run>"}
[INFO ] 2020-10-13 13:16:47.839 [[main]-pipeline-manager] javapipeline - Pipeline Java execution initialization time {"seconds"=>0.73}
[INFO ] 2020-10-13 13:16:47.900 [[main]-pipeline-manager] javapipeline - Pipeline started {"pipeline.id"=>"main"}
The stdin plugin is now waiting for input:
[INFO ] 2020-10-13 13:16:48.010 [Agent thread] agent - Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}
[INFO ] 2020-10-13 13:16:48.227 [Api Webserver] agent - Successfully started Logstash API endpoint {:port=>9600}
hello world    # type some text on stdin
{
    "host" => "node1",    # the current host
    "message" => "hello world",    # the message that was entered
    "@version" => "1",    # version number
    "@timestamp" => 2020-10-13T06:08:07.476Z
}

Example 2: use grok to parse log lines read from standard input and print the structured result.

2.1 Define a grok match to filter the input.
Syntax:
%{SYNTAX:SEMANTIC}
SYNTAX: the name of a predefined pattern;
SEMANTIC: your own identifier for the text that the pattern matches;
[root@node1 conf.d]# vim groksimple.conf
input {
  stdin {}
}

filter {
  grok {
    match => { "message" => "%{IP:clientip} %{WORD:method} %{URIPATHPARAM:request} %{NUMBER:bytes} %{NUMBER:duration}" }
  }
}

output {
  stdout {
    codec => rubydebug
  }
}
[root@node1 conf.d]# logstash -f /etc/logstash/conf.d/groksimple.conf
WARNING: Could not find logstash.yml which is typically located in $LS_HOME/config or /etc/logstash. You can specify the path using --path.settings. Continuing using the defaults
Could not find log4j2 configuration at path /usr/share/logstash/config/log4j2.properties. Using default config which logs errors to the console
[INFO ] 2020-10-13 14:29:41.936 [main] runner - Starting Logstash {"logstash.version"=>"7.9.1", "jruby.version"=>"jruby 9.2.13.0 (2.5.7) 2020-08-03 9a89c94bcc OpenJDK 64-Bit Server VM 25.262-b10 on 1.8.0_262-b10 +indy +jit [linux-x86_64]"}
[WARN ] 2020-10-13 14:29:42.412 [LogStash::Runner] multilocal - Ignoring the 'pipelines.yml' file because modules or command line options are specified
[INFO ] 2020-10-13 14:29:44.025 [Converge PipelineAction::Create<main>] Reflections - Reflections took 42 ms to scan 1 urls, producing 22 keys and 45 values
[INFO ] 2020-10-13 14:29:44.995 [[main]-pipeline-manager] javapipeline - Starting pipeline {:pipeline_id=>"main", "pipeline.workers"=>2, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>50, "pipeline.max_inflight"=>250, "pipeline.sources"=>["/etc/logstash/conf.d/groksimple.conf"], :thread=>"#<Thread:0x4ca2f74b run>"}
[INFO ] 2020-10-13 14:29:45.749 [[main]-pipeline-manager] javapipeline - Pipeline Java execution initialization time {"seconds"=>0.74}
[INFO ] 2020-10-13 14:29:45.820 [[main]-pipeline-manager] javapipeline - Pipeline started {"pipeline.id"=>"main"}
The stdin plugin is now waiting for input:
[INFO ] 2020-10-13 14:29:45.902 [Agent thread] agent - Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}
[INFO ] 2020-10-13 14:29:46.098 [Api Webserver] agent - Successfully started Logstash API endpoint {:port=>9600}
1.1.1.1 get /index.html 30 0.23    # type a sample log line on stdin
{
"@timestamp" => 2020-10-13T06:30:11.973Z,
"host" => "node1",
"@version" => "1",
"request" => "/index.html",
"message" => "1.1.1.1 get /index.html 30 0.23",
"duration" => "0.23",
"clientip" => "1.1.1.1",
"method" => "get",
"bytes" => "30"
}
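When no predefined pattern fits, grok also lets you define patterns inline via the filter's pattern_definitions option. A sketch (the SESSIONID pattern name and its expression are invented for illustration):

```
filter {
  grok {
    # define a custom pattern usable within this filter
    pattern_definitions => { "SESSIONID" => "[A-F0-9]{8}" }
    match => { "message" => "%{IP:clientip} %{SESSIONID:session}" }
  }
}
```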


Example 3: parse logs produced by a web server and print them to standard output.

For example, Apache logs: grok ships with a dedicated pattern for the Apache log format.
[root@node1 conf.d]# vim httpdsimple.conf
input {
  file {    # read from a file
    path => ["/var/log/httpd/access_log"]    # file path
    type => "apachelog"    # event type
    start_position => "beginning"    # read the file from the start
  }
}

filter {
  grok {    # parse and structure each line
    match => {"message" => "%{COMBINEDAPACHELOG}"}    # pattern for the Apache combined log format
  }
}

output {
  stdout {
    codec => rubydebug
  }
}
[root@node4 conf.d]# logstash -f /etc/logstash/conf.d/httpd.conf --path.data=/tmp
WARNING: Could not find logstash.yml which is typically located in $LS_HOME/config or /etc/logstash. You can specify the path using --path.settings. Continuing using the defaults
Could not find log4j2 configuration at path /usr/share/logstash/config/log4j2.properties. Using default config which logs errors to the console
[INFO ] 2020-10-13 17:57:53.743 [main] runner - Starting Logstash {"logstash.version"=>"7.9.1", "jruby.version"=>"jruby 9.2.13.0 (2.5.7) 2020-08-03 9a89c94bcc OpenJDK 64-Bit Server VM 25.262-b10 on 1.8.0_262-b10 +indy +jit [linux-x86_64]"}
[INFO ] 2020-10-13 17:57:53.803 [main] writabledirectory - Creating directory {:setting=>"path.queue", :path=>"/tmp/queue"}
[INFO ] 2020-10-13 17:57:53.819 [main] writabledirectory - Creating directory {:setting=>"path.dead_letter_queue", :path=>"/tmp/dead_letter_queue"}
[WARN ] 2020-10-13 17:57:54.315 [LogStash::Runner] multilocal - Ignoring the 'pipelines.yml' file because modules or command line options are specified
[INFO ] 2020-10-13 17:57:54.352 [LogStash::Runner] agent - No persistent UUID file found. Generating new UUID {:uuid=>"99ab593e-1436-49c0-874a-e815644cb316", :path=>"/tmp/uuid"}
[INFO ] 2020-10-13 17:57:56.722 [Converge PipelineAction::Create<main>] Reflections - Reflections took 51 ms to scan 1 urls, producing 22 keys and 45 values
[INFO ] 2020-10-13 17:57:58.642 [[main]-pipeline-manager] javapipeline - Starting pipeline {:pipeline_id=>"main", "pipeline.workers"=>2, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>50, "pipeline.max_inflight"=>250, "pipeline.sources"=>["/etc/logstash/conf.d/httpd.conf"], :thread=>"#<Thread:0x7878416d run>"}
[INFO ] 2020-10-13 17:57:59.663 [[main]-pipeline-manager] javapipeline - Pipeline Java execution initialization time {"seconds"=>1.01}
[INFO ] 2020-10-13 17:58:00.166 [[main]-pipeline-manager] file - No sincedb_path set, generating one based on the "path" setting {:sincedb_path=>"/tmp/plugins/inputs/file/.sincedb_15940cad53dd1d99808eeaecd6f6ad3f", :path=>["/var/log/httpd/access_log"]}
[INFO ] 2020-10-13 17:58:00.198 [[main]-pipeline-manager] javapipeline - Pipeline started {"pipeline.id"=>"main"}
[INFO ] 2020-10-13 17:58:00.316 [Agent thread] agent - Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}
[INFO ] 2020-10-13 17:58:00.356 [[main]<file] observingtail - START, creating Discoverer, Watch with file and sincedb collections
[INFO ] 2020-10-13 17:58:00.852 [Api Webserver] agent - Successfully started Logstash API endpoint {:port=>9603}

Access 10.5.100.183 directly in a browser; the events then appear on stdout:

{
"@timestamp" => 2020-10-13T10:01:02.347Z,
"message" => "- - - [13/Oct/2020:18:01:01 +0800] \"GET / HTTP/1.1\" 304 - \"-\" \"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/77.0.3865.90 Safari/537.36\"",
"@version" => "1",
"tags" => [
[0] "_grokparsefailure"
],
"path" => "/var/log/httpd/access_log",
"type" => "apachelog",
"host" => "node4"
}
{
"@timestamp" => 2020-10-13T10:01:02.407Z,
"message" => "- - - [13/Oct/2020:18:01:01 +0800] \"GET /favicon.ico HTTP/1.1\" 404 209 \"http://10.5.100.183/\" \"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/77.0.3865.90 Safari/537.36\"",
"@version" => "1",
"tags" => [
[0] "_grokparsefailure"
],
"path" => "/var/log/httpd/access_log",
"type" => "apachelog",
"host" => "node4"
}
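Note the "_grokparsefailure" tag on both events above: %{COMBINEDAPACHELOG} expects a client IP or hostname at the start of the line, but these entries begin with "- - -" (the client address field was logged as "-"), so the pattern does not match. One common way to handle such events is to branch on the tag in the output; a sketch (the failure-log path is an arbitrary choice):

```
output {
  if "_grokparsefailure" in [tags] {
    # events grok could not parse: write them to a file for inspection
    file { path => "/tmp/grok_failures.log" }
  } else {
    stdout { codec => rubydebug }
  }
}
```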