Logstash 7.9.1 - Official Tutorial 1 - Getting Started
阿新 · Published: 2020-10-16
This article is a hands-on walkthrough and translation of the official tutorial.
Machine configuration
CentOS 7.6, 64-bit, 16 cores, 16 GB RAM
```
[sysoper@10-99-10-31 ~]$ cat /etc/redhat-release
CentOS Linux release 7.6.1810 (Core)
[sysoper@10-99-10-31 ~]$ cat /proc/cpuinfo | grep name | cut -f2 -d: | uniq -c
     16  Intel(R) Xeon(R) Gold 5217 CPU @ 3.00GHz
[sysoper@10-99-10-31 ~]$ cat /proc/cpuinfo | grep "physical id" | sort | uniq | wc -l
16
[sysoper@10-99-10-31 ~]$ cat /proc/cpuinfo | grep "cpu cores" | uniq
cpu cores : 1
[sysoper@10-99-10-31 ~]$ cat /proc/meminfo | grep MemTotal
MemTotal:       16264896 kB
[sysoper@10-99-10-31 logstashdir]$ uname -i
x86_64
```
Install the JDK
Omitted here.
Install Logstash
- Download and extract:
```
wget https://artifacts.elastic.co/downloads/logstash/logstash-7.9.1.tar.gz
tar -zxvf logstash-7.9.1.tar.gz
```
- Run the most basic Logstash pipeline to verify the installation:
```
cd logstash-7.9.1
bin/logstash -e 'input { stdin { } } output { stdout {} }'
# type "hello" and watch the output; press Ctrl+D to exit Logstash
```
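If the install is healthy, each line you type is echoed back as a structured event. The exact timestamp and host name depend on your machine; the shape is roughly like this (illustrative sketch, not captured from a real run):

```
{
       "message" => "hello",
      "@version" => "1",
    "@timestamp" => 2020-09-18T01:00:00.000Z,
          "host" => "10-99-10-31"
}
```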
Analyzing logs with Logstash
First, download the sample log file provided by the official tutorial:
```
wget https://download.elastic.co/demos/logstash/gettingstarted/logstash-tutorial.log.gz
gunzip logstash-tutorial.log.gz
```
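Optionally, take a quick look at the extracted file to see what Filebeat will be shipping (plain Apache access-log lines):

```
head -n 1 logstash-tutorial.log
wc -l logstash-tutorial.log
```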
Using Filebeat
Before building the Logstash pipeline, configure and use Filebeat to send log lines to Logstash.
- Download and extract:
```
wget https://artifacts.elastic.co/downloads/beats/filebeat/filebeat-7.9.1-linux-x86_64.tar.gz
tar -zxvf filebeat-7.9.1-linux-x86_64.tar.gz
```
- Modify/replace the contents of `filebeat.yml`:
```
# back up the original file
mv filebeat.yml filebeat.yml.bak
vim filebeat.yml
```
New contents (replace `/path/to/file/` with the actual directory of the sample log):
```
filebeat.inputs:
- type: log
  paths:
    - /path/to/file/logstash-tutorial.log
output.logstash:
  hosts: ["localhost:5044"]
```
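Before starting, you can sanity-check this file with Filebeat's built-in `test` subcommands (available in Filebeat 7.x); note that `test output` will keep failing until Logstash is actually listening on 5044:

```
./filebeat test config -c filebeat.yml
./filebeat test output -c filebeat.yml
```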
- Start Filebeat
- Run:
```
sudo ./filebeat -e -c filebeat.yml -d "publish"
```
- You may hit this error:
```
Exiting: error loading config file: config file ("filebeat.yml") must be owned by the user identifier (uid=0) or root
```
- Fix:
```
sudo su - root
chown root filebeat.yml
chmod go-w /etc/{beatname}/{beatname}.yml
```
- Reference (official docs): https://www.elastic.co/guide/en/beats/libbeat/5.3/config-file-permissions.html
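Alternatively, for local experiments only, Filebeat can be started with the ownership/permission check disabled via its standard `--strict.perms=false` flag (do not do this in production):

```
sudo ./filebeat -e -c filebeat.yml -d "publish" --strict.perms=false
```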
- Run again:
```
2020-09-18T09:19:37.870+0800    INFO    [publisher]     pipeline/retry.go:219   retryer: send unwait signal to consumer
2020-09-18T09:19:37.871+0800    INFO    [publisher]     pipeline/retry.go:223   done
2020-09-18T09:19:40.650+0800    ERROR   [publisher_pipeline_output]     pipeline/output.go:154  Failed to connect to backoff(async(tcp://localhost:5044)): dial tcp 127.0.0.1:5044: connect: connection refused
```
- Filebeat will keep trying to connect to port 5044. Until Logstash starts with an active Beats plugin, there will be no answer on that port, so any connection-failure messages you see for that port right now are normal.
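To confirm this yourself, you can probe the port (assuming netcat is installed); it should report closed until Logstash is up:

```
nc -z localhost 5044 && echo "5044 open" || echo "5044 closed"
```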
Configuring the Logstash Beats plugin
Next, create a Logstash pipeline configuration that uses the Beats input plugin to receive events from Beats.
- Create a simple pipeline configuration file, `first-pipeline.conf`:
```
input {
    beats {
        port => "5044"
    }
}
# The filter part of this file is commented out to indicate that it is
# optional.
# filter {
#
# }
output {
    stdout { codec => rubydebug }
}
```
- In this configuration, the input uses the Beats plugin and the output simply prints to the console (we will redirect it to Elasticsearch later).
- Validate the configuration file:
```
bin/logstash -f first-pipeline.conf --config.test_and_exit
```
- If validation passes, start a Logstash instance with this configuration:
```
bin/logstash -f first-pipeline.conf --config.reload.automatic
```
The `--config.reload.automatic` flag enables automatic configuration reloading, so you do not have to stop and restart Logstash every time you modify the configuration file.
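Logstash polls the file for changes every few seconds by default; if needed, the polling interval can be tuned with the standard `--config.reload.interval` flag, e.g.:

```
bin/logstash -f first-pipeline.conf --config.reload.automatic --config.reload.interval 5s
```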
- Once Logstash starts successfully, the Filebeat terminal immediately shows the connection to port 5044 being established:
```
2020-09-18T09:43:11.326+0800    INFO    [publisher_pipeline_output]     pipeline/output.go:151  Connection to backoff(async(tcp://localhost:5044)) established
```
Logstash, for its part, also receives the log output (one event shown):
```
{
       "@version" => "1",
     "@timestamp" => 2020-09-18T01:19:35.207Z,
          "input" => {
        "type" => "log"
    },
            "ecs" => {
        "version" => "1.5.0"
    },
           "host" => {
        "name" => "10-99-10-31"
    },
            "log" => {
          "file" => {
            "path" => "/app/sysoper/logstashdir/logstash-tutorial.log"
        },
        "offset" => 21199
    },
           "tags" => [
        [0] "beats_input_codec_plain_applied"
    ],
        "message" => "218.30.103.62 - - [04/Jan/2015:05:27:36 +0000] \"GET /projects/xdotool/xdotool.xhtml HTTP/1.1\" 304 - \"-\" \"Sogou web spider/4.0(+http://www.sogou.com/docs/help/webmasters.htm#07)\"",
          "agent" => {
                "type" => "filebeat",
             "version" => "7.9.1",
            "hostname" => "10-99-10-31",
        "ephemeral_id" => "aad5f124-e37a-474a-9e50-c4317229df4b",
                  "id" => "d7a76fd8-db13-45c8-99bd-3ae4dc3a3f92",
                "name" => "10-99-10-31"
    }
}
```
Structuring the data with the grok filter plugin
The grok filter plugin is one of several plugins available in Logstash out of the box (see Logstash plugin management).
The grok filter plugin lets you parse unstructured log data into structured, queryable fields. To use it, you must specify a parsing pattern, i.e. the pattern that the regularly formatted text will be structured against.
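As a minimal illustration (hypothetical log line and field names, not part of this tutorial), a filter like the following would split `2020-09-18 10:00:00 INFO Started server` into three fields using built-in patterns:

```
filter {
    grok {
        # ts, level and msg are arbitrary field names chosen for this sketch
        match => { "message" => "%{TIMESTAMP_ISO8601:ts} %{LOGLEVEL:level} %{GREEDYDATA:msg}" }
    }
}
```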
- Modify `first-pipeline.conf` and add a filter section. Here we use the built-in `%{COMBINEDAPACHELOG}` grok pattern, which parses lines from Apache logs as HTTP requests:
```
filter {
    grok {
        match => { "message" => "%{COMBINEDAPACHELOG}"}
    }
}
```
- Automatic config reloading was enabled earlier, so Logstash does not need to be restarted:
```
[2020-09-18T10:23:52,212][INFO ][logstash.pipelineaction.reload] Reloading pipeline {"pipeline.id"=>:main}
```
- However, Filebeat stores the state of each file it harvests in its registry, so deleting the registry files forces Filebeat to re-read all harvested files from the beginning. Go back to the Filebeat session and run:
```
sudo rm -fr data/registry
sudo ./filebeat -e -c filebeat.yml -d "publish"
```
- Back in the Logstash console, you can see that the JSON output has changed:
{ "@version" => "1", "log" => { "file" => { "path" => "/app/sysoper/logstashdir/logstash-tutorial.log" }, "offset" => 19617 }, "ecs" => { "version" => "1.5.0" }, "host" => { "name" => "10-99-10-31" }, "response" => "200", "verb" => "GET", "tags" => [ [0] "beats_input_codec_plain_applied" ], "httpversion" => "1.1", "@timestamp" => 2020-09-18T02:26:47.845Z, "input" => { "type" => "log" }, "auth" => "-", "clientip" => "218.30.103.62", "request" => "/projects/fex/", "bytes" => "14352", "timestamp" => "04/Jan/2015:05:27:15 +0000", "ident" => "-", "message" => "218.30.103.62 - - [04/Jan/2015:05:27:15 +0000] \"GET /projects/fex/ HTTP/1.1\" 200 14352 \"-\" \"Sogou web spider/4.0(+http://www.sogou.com/docs/help/webmasters.htm#07)\"", "agent" => { "type" => "filebeat", "version" => "7.9.1", "hostname" => "10-99-10-31", "id" => "d7a76fd8-db13-45c8-99bd-3ae4dc3a3f92", "ephemeral_id" => "90901fd8-81aa-4271-ad11-1ca77cb455e5", "name" => "10-99-10-31" }, "referrer" => "\"-\"" }
Enriching the output with the geoip filter plugin
Besides parsing log data for better searching, filter plugins can derive supplementary information from existing data. For example, the geoip plugin looks up IP addresses, derives geographic location information from them, and adds that location information to the logs.
- Modify the pipeline configuration to add the geoip filter:
```
filter {
    grok {
        match => { "message" => "%{COMBINEDAPACHELOG}"}
    }
    geoip {
        source => "clientip"
    }
}
```
- Then, as before, clear the Filebeat registry and restart it, and you can watch the Logstash output change:
"httpversion" => "1.0", "geoip" => { "postal_code" => "32963", "region_name" => "Florida", "location" => { "lon" => -80.3757, "lat" => 27.689799999999998 },
Indexing the data into Elasticsearch
We have split the web logs into specific fields and printed them to the console. Now the data output can be directed into Elasticsearch.
You can run Elasticsearch on your own hardware or use our hosted Elasticsearch Service that is available on AWS, GCP, and Azure. Try the Elasticsearch Service for free.
- Quick local Elasticsearch setup.
- Download:
```
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-7.9.1-linux-x86_64.tar.gz
```
- Extract:
```
tar -xvf elasticsearch-7.9.1-linux-x86_64.tar.gz
```
- Run:
```
./elasticsearch-7.9.1/bin/elasticsearch
```
- Edit the `first-pipeline.conf` file and replace the entire output section with the following:
```
output {
    elasticsearch {
        hosts => [ "localhost:9200" ]
    }
}
```
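The elasticsearch output picks index names automatically; on 7.x the ILM-managed default produces indices such as `logstash-2020.09.18-000001`, as seen below. If you want a custom index name, the plugin's `index` option can be set; the name here is just an example, and on 7.x you may also want `ilm_enabled => false` so the custom name takes effect:

```
output {
    elasticsearch {
        hosts => [ "localhost:9200" ]
        # hypothetical custom index name (one index per day)
        index       => "weblog-%{+YYYY.MM.dd}"
        ilm_enabled => false
    }
}
```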
- Restart Filebeat as before (clearing its registry first).
At this point, the Logstash pipeline is configured to index data into an Elasticsearch cluster (here a local single node), and Elasticsearch can be queried.
- List the existing indices in ES:
```
curl 'localhost:9200/_cat/indices?v'
```
Result:
```
health status index                      uuid                   pri rep docs.count docs.deleted store.size pri.store.size
yellow open   logstash-2020.09.18-000001 7OxpUa9EQ9yv4w-_CqOYtw   1   1        100            0    335.7kb        335.7kb
```
- Query the target data in ES against that index:
```
curl -XGET 'localhost:9200/logstash-2020.09.18-000001/_search?pretty&q=geoip.city_name=Buffalo'
```
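Any field extracted by grok can be queried the same way; for example, filtering on the HTTP response code (the index name must match the one listed above):

```
curl -XGET 'localhost:9200/logstash-2020.09.18-000001/_search?pretty&q=response=200'
```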
- OK
Visualizing the data with Kibana
- Download and extract:
```
wget https://artifacts.elastic.co/downloads/kibana/kibana-7.9.1-linux-x86_64.tar.gz
tar -xvf kibana-7.9.1-linux-x86_64.tar.gz
```
- Edit `config/kibana.yml` and set `elasticsearch.hosts` to point at the running ES instance:
```
# The URLs of the Elasticsearch instances to use for all your queries.
elasticsearch.hosts: ["http://localhost:9200"]
```
- Start Kibana:
```
./kibana-7.9.1-linux-x86_64/bin/kibana
```
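If you want Kibana to keep running after the terminal closes, a common approach is to background it with nohup (a sketch; the log file path is arbitrary):

```
nohup ./kibana-7.9.1-linux-x86_64/bin/kibana > kibana.log 2>&1 &
```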
- Visit http://localhost:5601 to view the data.
- Issue 1: the Kibana page cannot be reached. Fix: in `kibana.yml`, set `server.host: "0.0.0.0"` so Kibana listens on all interfaces.
- Issue 2: no "kibana" process can be found. Kibana runs on Node.js, so look for the node process instead: `ps -elf | grep node`, or check the port with `netstat -tunlp | grep 5601`.
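You can also confirm Kibana is up over HTTP; its status endpoint returns JSON once the server is ready:

```
curl -s http://localhost:5601/api/status
```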