
Logstash 7.9.1 Official Tutorial 1: Getting Started

This article is a hands-on walkthrough and translation of the official tutorial.

Machine configuration

CentOS 7.6, 64-bit, 16 cores, 16 GB RAM

[sysoper@10-99-10-31 ~]$ cat /etc/redhat-release
CentOS Linux release 7.6.1810 (Core) 
[sysoper@10-99-10-31 ~]$ cat /proc/cpuinfo | grep name | cut -f2 -d: | uniq -c
16  Intel(R) Xeon(R) Gold 5217 CPU @ 3.00GHz
[sysoper@10-99-10-31 ~]$ cat /proc/cpuinfo| grep "physical id"| sort| uniq| wc -l
16
[sysoper@10-99-10-31 ~]$ cat /proc/cpuinfo| grep "cpu cores"| uniq
cpu cores       : 1
[sysoper@10-99-10-31 ~]$ cat /proc/meminfo |grep MemTotal
MemTotal:       16264896 kB
[sysoper@10-99-10-31 logstashdir]$ uname -i
x86_64

Install the JDK (Logstash 7.9 requires Java 8 or Java 11)

Install Logstash

  • Download and extract
    wget https://artifacts.elastic.co/downloads/logstash/logstash-7.9.1.tar.gz
    tar -zxvf logstash-7.9.1.tar.gz
    
  • Run the most basic Logstash pipeline to verify the installation
    cd logstash-7.9.1
    bin/logstash -e 'input { stdin { } } output { stdout {} }'
    # type "hello" and watch the echoed event
    # press Ctrl+D to exit Logstash
    

Analyze logs with Logstash

First, download the sample log file provided by the official tutorial:

wget https://download.elastic.co/demos/logstash/gettingstarted/logstash-tutorial.log.gz
gzip -d logstash-tutorial.log.gz

Use Filebeat

Before creating the Logstash pipeline, configure and use Filebeat to send log lines to Logstash.

  • Download and extract
    wget https://artifacts.elastic.co/downloads/beats/filebeat/filebeat-7.9.1-linux-x86_64.tar.gz
    tar -zxvf filebeat-7.9.1-linux-x86_64.tar.gz 
    
  • Modify/replace the contents of filebeat.yml
    # back up the original first
    mv filebeat.yml filebeat.yml.bak
    vim filebeat.yml
    
    # contents:
    filebeat.inputs:
    - type: log
      paths:
        - /path/to/file/logstash-tutorial.log 
    output.logstash:
      hosts: ["localhost:5044"]
    
  • Start Filebeat
    • Run sudo ./filebeat -e -c filebeat.yml -d "publish"
    • You may hit this error: Exiting: error loading config file: config file ("filebeat.yml") must be owned by the user identifier (uid=0) or root
    • Fix:
      sudo su - root
      chown root filebeat.yml
      chmod go-w /etc/{beatname}/{beatname}.yml
      
    • Reference: https://www.elastic.co/guide/en/beats/libbeat/5.3/config-file-permissions.html
    • Run it again:
      2020-09-18T09:19:37.870+0800    INFO    [publisher]     pipeline/retry.go:219   retryer: send unwait signal to consumer
      2020-09-18T09:19:37.871+0800    INFO    [publisher]     pipeline/retry.go:223     done
      2020-09-18T09:19:40.650+0800    ERROR   [publisher_pipeline_output]     pipeline/output.go:154  Failed to connect to backoff(async(tcp://localhost:5044)): dial tcp 127.0.0.1:5044: connect: connection refused
      
    • Filebeat will try to connect on port 5044. Until Logstash starts with an active Beats plugin, there will be no response on that port, so any connection-failure messages you see for that port right now are normal.
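While waiting for Logstash to come up, a quick way to check whether anything is listening on the Beats port yet is a plain TCP connect. A minimal sketch (host and port match this tutorial's setup; the helper name is ours):

```python
import socket

def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Prints False until a Beats-enabled Logstash pipeline is listening on 5044.
print(port_open("localhost", 5044))
```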

Configure the Logstash Beats input plugin

Next, create a Logstash pipeline configuration that uses the Beats input plugin to receive events from Beats.

  • Create a simple pipeline configuration file, first-pipeline.conf:
    input {
        beats {
            port => "5044"
        }
    }
    # The filter part of this file is commented out to indicate that it is
    # optional.
    # filter {
    #
    # }
    output {
        stdout { codec => rubydebug }
    }
    
  • In this configuration, input uses the beats plugin and output prints to the console (we will switch the output to Elasticsearch later).
  • Validate the configuration file: bin/logstash -f first-pipeline.conf --config.test_and_exit
  • If validation passes, start a Logstash instance with that configuration: bin/logstash -f first-pipeline.conf --config.reload.automatic
    • --config.reload.automatic enables automatic configuration reloading, so you do not have to stop and restart Logstash every time you modify the configuration file.
  • Once startup succeeds, you can immediately see the earlier Filebeat terminal connect to 5044: 2020-09-18T09:43:11.326+0800 INFO [publisher_pipeline_output] pipeline/output.go:151 Connection to backoff(async(tcp://localhost:5044)) established, and Logstash now receives the log output (one event shown):
    {
      "@version" => "1",
        "@timestamp" => 2020-09-18T01:19:35.207Z,
             "input" => {
            "type" => "log"
        },
               "ecs" => {
            "version" => "1.5.0"
        },
              "host" => {
            "name" => "10-99-10-31"
        },
               "log" => {
              "file" => {
                "path" => "/app/sysoper/logstashdir/logstash-tutorial.log"
            },
            "offset" => 21199
        },
              "tags" => [
            [0] "beats_input_codec_plain_applied"
        ],
           "message" => "218.30.103.62 - - [04/Jan/2015:05:27:36 +0000] \"GET /projects/xdotool/xdotool.xhtml HTTP/1.1\" 304 - \"-\" \"Sogou web spider/4.0(+http://www.sogou.com/docs/help/webmasters.htm#07)\"",
             "agent" => {
                    "type" => "filebeat",
                 "version" => "7.9.1",
                "hostname" => "10-99-10-31",
            "ephemeral_id" => "aad5f124-e37a-474a-9e50-c4317229df4b",
                      "id" => "d7a76fd8-db13-45c8-99bd-3ae4dc3a3f92",
                    "name" => "10-99-10-31"
        }
    }
    

Structure the data with the Grok filter plugin

The grok filter plugin is one of several plugins available in Logstash by default (see Logstash plugin management).
The grok filter plugin lets you parse unstructured log data into structured, queryable data. To use it, you specify a parse pattern, that is, the pattern your regular text data follows so it can be structured.

  • Edit first-pipeline.conf and add a filter configuration. Here we use the built-in %{COMBINEDAPACHELOG} grok pattern, which parses lines from Apache logs into HTTP request fields:
    grok {
        match => { "message" => "%{COMBINEDAPACHELOG}"}
    }
    
  • Automatic configuration reloading was enabled earlier, so Logstash does not need a restart: [2020-09-18T10:23:52,212][INFO ][logstash.pipelineaction.reload] Reloading pipeline {"pipeline.id"=>:main}
  • However, Filebeat stores the state of each file it harvests in its registry, so deleting the registry files forces Filebeat to re-read all the files it harvests from the beginning. Go back to the Filebeat session and run:
    sudo rm -fr data/registry
    sudo ./filebeat -e -c filebeat.yml -d "publish"
    
  • Back in the Logstash console, you can see the output JSON data has changed:
    {
       "@version" => "1",
            "log" => {
              "file" => {
                "path" => "/app/sysoper/logstashdir/logstash-tutorial.log"
            },
            "offset" => 19617
        },
                "ecs" => {
            "version" => "1.5.0"
        },
               "host" => {
            "name" => "10-99-10-31"
        },
           "response" => "200",
               "verb" => "GET",
               "tags" => [
            [0] "beats_input_codec_plain_applied"
        ],
        "httpversion" => "1.1",
         "@timestamp" => 2020-09-18T02:26:47.845Z,
              "input" => {
            "type" => "log"
        },
               "auth" => "-",
           "clientip" => "218.30.103.62",
            "request" => "/projects/fex/",
              "bytes" => "14352",
          "timestamp" => "04/Jan/2015:05:27:15 +0000",
              "ident" => "-",
            "message" => "218.30.103.62 - - [04/Jan/2015:05:27:15 +0000] \"GET /projects/fex/ HTTP/1.1\" 200 14352 \"-\" \"Sogou web spider/4.0(+http://www.sogou.com/docs/help/webmasters.htm#07)\"",
              "agent" => {
                    "type" => "filebeat",
                 "version" => "7.9.1",
                "hostname" => "10-99-10-31",
                      "id" => "d7a76fd8-db13-45c8-99bd-3ae4dc3a3f92",
            "ephemeral_id" => "90901fd8-81aa-4271-ad11-1ca77cb455e5",
                    "name" => "10-99-10-31"
        },
           "referrer" => "\"-\""
    }
    
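As an aside, the field extraction that %{COMBINEDAPACHELOG} performs can be approximated with an ordinary regular expression. The sketch below is a simplified, illustrative stand-in (the real grok pattern is considerably more permissive), applied to one line from the sample log:

```python
import re

# Simplified stand-in for Logstash's COMBINEDAPACHELOG grok pattern.
COMBINED = re.compile(
    r'(?P<clientip>\S+) (?P<ident>\S+) (?P<auth>\S+) '
    r'\[(?P<timestamp>[^\]]+)\] '
    r'"(?P<verb>\S+) (?P<request>\S+) HTTP/(?P<httpversion>\S+)" '
    r'(?P<response>\d{3}) (?P<bytes>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

# One line from the sample log used in this tutorial.
line = ('218.30.103.62 - - [04/Jan/2015:05:27:15 +0000] '
        '"GET /projects/fex/ HTTP/1.1" 200 14352 "-" '
        '"Sogou web spider/4.0(+http://www.sogou.com/docs/help/webmasters.htm#07)"')

fields = COMBINED.match(line).groupdict()
print(fields["clientip"], fields["verb"], fields["request"], fields["response"])
# → 218.30.103.62 GET /projects/fex/ 200
```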

Enrich the output with the Geoip filter plugin

Besides parsing log data for better searching, filter plugins can derive supplementary information from existing data. For example, the geoip plugin looks up IP addresses, derives geographic location information from them, and adds that location information to the log.

  • Edit the pipeline configuration and add the Geoip filter:
    filter {
        grok {
            match => { "message" => "%{COMBINEDAPACHELOG}"}
        }
        geoip {
            source => "clientip"
        }
    }
    
  • Then, just as before, clear the Filebeat registry and restart Filebeat; you can see the Logstash output change:
    "httpversion" => "1.0",
          "geoip" => {
           "postal_code" => "32963",
           "region_name" => "Florida",
              "location" => {
            "lon" => -80.3757,
            "lat" => 27.689799999999998
        },
    
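Conceptually, the geoip filter is an enrichment lookup: take the value of the source field, look it up in a location database, and merge the result into the event. A minimal Python sketch, with a hardcoded stand-in for the real GeoIP database (the function name and the coordinate values are illustrative, not from the plugin):

```python
# Hardcoded stand-in for the GeoIP database; values are illustrative only.
GEO_DB = {
    "218.30.103.62": {"country_name": "China",
                      "location": {"lon": 113.7, "lat": 34.8}},
}

def geoip_filter(event: dict, source: str = "clientip") -> dict:
    """Mimic the geoip filter: attach location data for event[source], if known."""
    info = GEO_DB.get(event.get(source))
    if info is not None:
        event["geoip"] = info
    return event

event = {"clientip": "218.30.103.62", "verb": "GET"}
print(geoip_filter(event)["geoip"]["country_name"])  # → China
```

Unknown IPs simply pass through unchanged, which mirrors the plugin's behavior of leaving events without a match untouched (apart from a failure tag).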

Index the data into Elasticsearch

We have broken the web logs into specific fields and printed them to the console. Now the output can be directed to Elasticsearch.

You can run Elasticsearch on your own hardware or use our hosted Elasticsearch Service that is available on AWS, GCP, and Azure. Try the Elasticsearch Service for free.

  • Quick local Elasticsearch setup
    • Download: wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-7.9.1-linux-x86_64.tar.gz
    • Extract: tar -xvf elasticsearch-7.9.1-linux-x86_64.tar.gz
    • Run: ./elasticsearch-7.9.1/bin/elasticsearch
  • Edit the first-pipeline.conf file and replace the entire output section with the following:
    output {
        elasticsearch {
            hosts => [ "localhost:9200" ]
        }
    }
    
  • Restart Filebeat as before.
    The Logstash pipeline is now configured to index the data into an Elasticsearch cluster (here a local single node), and Elasticsearch can be queried.
  • List the existing indices in ES: curl 'localhost:9200/_cat/indices?v'; result:
    health status index                      uuid                   pri rep docs.count docs.deleted store.size pri.store.size
    yellow open   logstash-2020.09.18-000001 7OxpUa9EQ9yv4w-_CqOYtw   1   1        100            0    335.7kb        335.7kb
    
  • Query the target data in ES by index: curl -XGET 'localhost:9200/logstash-2020.09.18-000001/_search?pretty&q=geoip.city_name=Buffalo'
  • OK
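Putting the pieces together, the complete first-pipeline.conf at the end of this walkthrough (Beats input, grok and geoip filters, Elasticsearch output) looks like this:

```
input {
    beats {
        port => "5044"
    }
}
filter {
    grok {
        match => { "message" => "%{COMBINEDAPACHELOG}" }
    }
    geoip {
        source => "clientip"
    }
}
output {
    elasticsearch {
        hosts => [ "localhost:9200" ]
    }
}
```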

*Visualize the data with Kibana

  • Download and extract
    wget https://artifacts.elastic.co/downloads/kibana/kibana-7.9.1-linux-x86_64.tar.gz
    tar -xvf kibana-7.9.1-linux-x86_64.tar.gz
    
  • Edit config/kibana.yml and point elasticsearch.hosts at the running ES instance:
    # The URLs of the Elasticsearch instances to use for all your queries.
    elasticsearch.hosts: ["http://localhost:9200"]
    
  • Start Kibana: ./kibana-7.9.1-linux-x86_64/bin/kibana
  • Visit http://localhost:5601 to view the data
  • Problem 1: the Kibana page cannot be reached
    Set server.host: "0.0.0.0" in kibana.yml
  • Problem 2: cannot find the Kibana process
    Kibana runs as a node process, so use ps -elf | grep node or netstat -tunlp | grep 5601