
Logstash Introduction and Configuration & Collecting Java Logs with Logstash

1. Introduction and Workflow

  Logstash is written in Ruby. Like Beats, it is a data shipper, but it is considerably more heavyweight and supports far more functionality.

1. Introduction

  The official description: transform and store your data.

  Logstash is a free and open server-side data processing pipeline that ingests data from a multitude of sources, transforms it, and then sends it to your favorite "stash".

  Logstash dynamically ingests, transforms, and ships data regardless of format or complexity. It can derive structure from unstructured data with Grok, decode geo coordinates from IP addresses, anonymize or exclude sensitive fields, and ease overall processing.

2. Workflow

1. Input - ingest data of all shapes, sizes, and sources

  Data is often scattered or siloed across many systems in many different formats. Logstash supports a variety of input choices and can capture events from a multitude of common sources at the same time. It can easily ingest data from your logs, metrics, web applications, data stores, and various AWS services, all in a continuous streaming fashion.

  For the plugins supported on the input side, see the input plugins documentation.
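
As a minimal sketch of "many sources at once" (the path and ports below are placeholders, not taken from this article), an input block can simply list several plugins side by side:

input {
  # tail a log file (placeholder path)
  file {
    path => "/var/log/app/*.log"
  }
  # accept events from Beats shippers
  beats {
    port => 5044
  }
  # read newline-delimited events from a TCP socket
  tcp {
    port => 9500
  }
}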

2. Filter - parse and transform data in real time

  As data travels from source to store, Logstash filters parse each event, identify named fields to build structure, and transform them into a common format for more powerful analysis and business value.

  Logstash dynamically transforms and parses data regardless of format or complexity: derive structure from unstructured data with Grok, decipher geo coordinates from IP addresses, anonymize PII data and exclude sensitive fields entirely, and ease overall processing, independent of data source, format, or schema.

  With the rich filter library and the versatile Elastic Common Schema, the possibilities are nearly endless.
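
A small sketch of the ideas above (the field names are illustrative): grok derives structure, and mutate excludes a sensitive field entirely:

filter {
  grok {
    match => { "message" => "%{IPORHOST:client_ip} %{GREEDYDATA:rest}" }
  }
  # drop a sensitive field before it reaches the stash (illustrative field name)
  mutate {
    remove_field => [ "password" ]
  }
}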

3. Output - choose your stash and route your data

  Elasticsearch is the preferred output and opens up a world of possibilities for search and analytics, but it is not the only choice. Logstash offers a variety of outputs, so you can route data wherever you want, with the flexibility to unlock many downstream use cases.
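
Outputs can also fan out to several destinations at once; a minimal sketch (the file path is a placeholder):

output {
  elasticsearch {
    hosts => ["http://localhost:9200"]
  }
  # keep a local archive copy as well (placeholder path)
  file {
    path => "/var/log/logstash/archive-%{+YYYY.MM.dd}.log"
  }
}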

2. Download and Install

1. Download Logstash

2. After unpacking, the directory layout is as follows:

3. Inspect the logstash/config directory:

The bundled sample configuration logstash-sample.conf looks like this:

# Sample Logstash configuration for creating a simple
# Beats -> Logstash -> Elasticsearch pipeline.

input {
  beats {
    port => 5044
  }
}

output {
  elasticsearch {
    hosts => ["http://localhost:9200"]
    index => "%{[@metadata][beat]}-%{[@metadata][version]}-%{+YYYY.MM.dd}"
    #user => "elastic"
    #password => "changeme"
  }
}
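
To try this sample pipeline as-is (assuming Elasticsearch is reachable on localhost:9200), point the -f flag at it, just as the examples below do:

bin/logstash -f config/logstash-sample.conf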

3. Getting Started

1. Collecting nginx access logs

  We collect to the console first, which makes debugging easier.

(1) Look at two lines of the nginx access log (since I have Git installed, Linux-style commands work on Windows):

liqiang@root MINGW64 /e/nginx/nginx-1.12.2/logs
$ pwd
/e/nginx/nginx-1.12.2/logs

liqiang@root MINGW64 /e/nginx/nginx-1.12.2/logs
$ head -n 2 ./access.log
127.0.0.1 - - [09/Mar/2018:17:45:59 +0800] "GET / HTTP/1.1" 200 612 "-" "Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.186 Safari/537.36"
127.0.0.1 - - [09/Mar/2018:17:48:00 +0800] "GET /Test.html HTTP/1.1" 200 142 "-" "Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.186 Safari/537.36"

(2) Create logstash_nginx.conf under the $logstash/config/ directory with the following content:

input {
  stdin { }
}

filter {
  grok {
    match => {
      "message" => '%{IPORHOST:remote_ip} - %{DATA:user_name} \[%{HTTPDATE:time}\] "%{WORD:request_action} %{DATA:request} HTTP/%{NUMBER:http_version}" %{NUMBER:response} %{NUMBER:bytes} "%{DATA:referrer}" "%{DATA:agent}"'
    }
  }

  date {
    match => [ "time", "dd/MMM/YYYY:HH:mm:ss Z" ]
    locale => en
  }

  geoip {
    source => "remote_ip"
    target => "geoip"
  }

  useragent {
    source => "agent"
    target => "user_agent"
  }
}

output {
  stdout {
    codec => rubydebug
  }
}

  grok: turns the unstructured log line into structured, JSON-style fields.

  date: converts the parsed time string into the event's @timestamp.

  geoip: looks up the geographic location of the IP address.

  useragent: extracts the client's browser and device details from the user-agent string.

(3) Test log collection (note in the output below that 127.0.0.1 is a loopback address, so the geoip lookup fails: geoip stays empty and the event is tagged _geoip_lookup_failure):

liqiang@root MINGW64 /e/ELK/logstash-7.6.2
$ head -n 2 /e/nginx/nginx-1.12.2/logs/access.log | /e/ELK/logstash-7.6.2/bin/logstash -f ./config/logstash_nginx.conf
Sending Logstash logs to E:/ELK/logstash-7.6.2/logs which is now configured via log4j2.properties
[2020-08-23T12:31:18,218][WARN ][logstash.config.source.multilocal] Ignoring the 'pipelines.yml' file because modules or command line options are specified
[2020-08-23T12:31:18,857][INFO ][logstash.runner          ] Starting Logstash {"logstash.version"=>"7.6.2"}
[2020-08-23T12:31:25,229][INFO ][org.reflections.Reflections] Reflections took 122 ms to scan 1 urls, producing 20 keys and 40 values
[2020-08-23T12:31:36,465][INFO ][logstash.filters.geoip   ][main] Using geoip database {:path=>"E:/ELK/logstash-7.6.2/vendor/bundle/jruby/2.5.0/gems/logstash-filter-geoip-6.0.3-java/vendor/GeoLite2-City.mmdb"}
[2020-08-23T12:31:36,994][WARN ][org.logstash.instrument.metrics.gauge.LazyDelegatingGauge][main] A gauge metric of an unknown type (org.jruby.RubyArray) has been created for key: cluster_uuids. This may result in invalid serialization.  It is recommended to log an issue to the responsible developer/development team.
[2020-08-23T12:31:37,019][INFO ][logstash.javapipeline    ][main] Starting pipeline {:pipeline_id=>"main", "pipeline.workers"=>4, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>50, "pipeline.max_inflight"=>500, "pipeline.sources"=>["E:/ELK/logstash-7.6.2/config/logstash_nginx.conf"], :thread=>"#<Thread:0x1e1b9b66 run>"}
[2020-08-23T12:31:40,502][INFO ][logstash.javapipeline    ][main] Pipeline started {"pipeline.id"=>"main"}
[2020-08-23T12:31:40,731][INFO ][logstash.agent           ] Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}
E:/ELK/logstash-7.6.2/vendor/bundle/jruby/2.5.0/gems/awesome_print-1.7.0/lib/awesome_print/formatters/base_formatter.rb:31: warning: constant ::Fixnum is deprecated
{
         "user_name" => "-",
          "@version" => "1",
              "host" => "root",
      "http_version" => "1.1",
             "bytes" => "142",
              "tags" => [
        [0] "_geoip_lookup_failure"
    ],
    "request_action" => "GET",
          "referrer" => "-",
             "agent" => "Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.186 Safari/537.36",
          "response" => "200",
             "geoip" => {},
              "time" => "09/Mar/2018:17:48:00 +0800",
           "request" => "/Test.html",
         "remote_ip" => "127.0.0.1",
        "@timestamp" => 2018-03-09T09:48:00.000Z,
           "message" => "127.0.0.1 - - [09/Mar/2018:17:48:00 +0800] \"GET /Test.html HTTP/1.1\" 200 142 \"-\" \"Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.186 Safari/537.36\"\r",
        "user_agent" => {
           "minor" => "0",
           "build" => "",
            "name" => "Chrome",
        "os_major" => "8",
        "os_minor" => "1",
          "device" => "Other",
         "os_name" => "Windows",
           "major" => "64",
           "patch" => "3282",
              "os" => "Windows"
    }
}
{
         "user_name" => "-",
          "@version" => "1",
              "host" => "root",
      "http_version" => "1.1",
             "bytes" => "612",
              "tags" => [
        [0] "_geoip_lookup_failure"
    ],
    "request_action" => "GET",
          "referrer" => "-",
             "agent" => "Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.186 Safari/537.36",
          "response" => "200",
             "geoip" => {},
              "time" => "09/Mar/2018:17:45:59 +0800",
           "request" => "/",
         "remote_ip" => "127.0.0.1",
        "@timestamp" => 2018-03-09T09:45:59.000Z,
           "message" => "127.0.0.1 - - [09/Mar/2018:17:45:59 +0800] \"GET / HTTP/1.1\" 200 612 \"-\" \"Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.186 Safari/537.36\"\r",
        "user_agent" => {
           "minor" => "0",
           "build" => "",
            "name" => "Chrome",
        "os_major" => "8",
        "os_minor" => "1",
          "device" => "Other",
         "os_name" => "Windows",
           "major" => "64",
           "patch" => "3282",
              "os" => "Windows"
    }
}
[2020-08-23T12:31:43,718][INFO ][logstash.agent           ] Successfully started Logstash API endpoint {:port=>9600}
[2020-08-23T12:31:44,453][INFO ][logstash.runner          ] Logstash shut down.

2. Collecting Java logs

  Ship the Java logs into Elasticsearch (ES).

1. A Spring Boot web project writes directly to Logstash via Logback

(1) Configure Logstash to listen on TCP port 4560, then start Logstash

Create logstash_java.conf under the $logstash/config directory:

# Logstash configuration for a simple
# TCP (JSON lines) -> Logstash -> Elasticsearch pipeline.

input {
  tcp {
    mode => "server"
    host => "127.0.0.1"
    port => 4560
    codec => json_lines
  }
}

output {
  elasticsearch {
    hosts => "127.0.0.1:9200"
    index => "springboot-logstash-%{+YYYY.MM.dd}"
  }
}
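
Because the input uses the json_lines codec, every newline-terminated JSON object written to the port becomes one event. You can smoke-test the listener before wiring up the application, for example with netcat (assuming nc is available; the message content is arbitrary):

echo '{"message":"hello from nc"}' | nc 127.0.0.1 4560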

(2) Start Logstash

$ /e/ELK/logstash-7.6.2/bin/logstash -f ./config/logstash_java.conf

(3) Add the dependency to the Spring Boot project's pom.xml

        <!--logStash -->
        <dependency>
            <groupId>net.logstash.logback</groupId>
            <artifactId>logstash-logback-encoder</artifactId>
            <version>5.3</version>
        </dependency>

(4) Create logback-spring.xml under src/main/resources

<?xml version="1.0" encoding="UTF-8"?>
<configuration>
    <include resource="org/springframework/boot/logging/logback/base.xml" />
    <appender name="LOGSTASH"
        class="net.logstash.logback.appender.LogstashTcpSocketAppender">
        <!-- Logstash server address -->
        <destination>127.0.0.1:4560</destination>
        <!-- log output encoder -->
        <encoder charset="UTF-8"
            class="net.logstash.logback.encoder.LoggingEventCompositeJsonEncoder">
            <providers>
                <timestamp>
                    <timeZone>UTC</timeZone>
                </timestamp>
                <pattern>
                    <pattern>
                        {
                        "logLevel": "%level",
                        "serviceName": "${springAppName:-}",
                        "pid": "${PID:-}",
                        "thread": "%thread",
                        "class": "%logger{40}",
                        "rest": "%message"
                        }
                    </pattern>
                </pattern>
            </providers>
        </encoder>
    </appender>

    <root level="DEBUG">
        <appender-ref ref="LOGSTASH" />
        <appender-ref ref="CONSOLE" />
    </root>
</configuration>
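
With this encoder, each log event reaches port 4560 as one JSON object per line, roughly of the shape below (the values are illustrative; serviceName stays empty unless springAppName is defined, e.g. via a springProperty element). This newline-delimited JSON is exactly what the json_lines codec on the Logstash side expects:

{"@timestamp":"2020-08-23T04:31:18.218Z","logLevel":"INFO","serviceName":"","pid":"12724","thread":"main","class":"com.zd.ICCApplication","rest":"Started ICCApplication"}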

(5) Start the application and check the logs:

(6) Create an index pattern in Kibana, then analyze the logs

Step 1:

Step 2:

View:

3. Collecting a log file generated by Java log4j

1. The log file format is as follows:

2020/08/13-13:09:09 [main] INFO  com.zd.ICCApplication.logStarting - Starting ICCApplication on MicroWin10-1535 with PID 12724 (E:\xiangmu\icc-server\trunk\target\classes started by Administrator in E:\xiangmu\icc-server\trunk)
2020/08/13-13:09:09 [main] DEBUG com.zd.ICCApplication.logStarting - Running with Spring Boot v2.3.1.RELEASE, Spring v5.2.7.RELEASE

2. Write the conf file logstash_file.conf and test it with standard input/output

input {
  stdin { }
}

filter {
  grok {
    match => {
      "message" => '%{DATESTAMP:time} \[%{WORD:threadName}\] %{WORD:logLevel} %{GREEDYDATA:syslog_message}'
    }
  }
  
  date {
    match => [ "time", "YYYY/MM/dd-HH:mm:ss" ]
    locale => en
  }  
}

output {
  stdout {
    codec => rubydebug
  }
}

The test run is shown below. Note the bogus year 0020 in @timestamp; the cause and fix follow the output:

$ head -n 2 /g/logs/test.log | ./bin/logstash -f ./config/logstash_file.conf
Sending Logstash logs to E:/ELK/logstash-7.6.2/logs which is now configured via log4j2.properties
[2020-08-25T20:44:25,769][WARN ][logstash.config.source.multilocal] Ignoring the 'pipelines.yml' file because modules or command line options are specified
[2020-08-25T20:44:26,519][INFO ][logstash.runner          ] Starting Logstash {"logstash.version"=>"7.6.2"}
[2020-08-25T20:44:33,369][INFO ][org.reflections.Reflections] Reflections took 220 ms to scan 1 urls, producing 20 keys and 40 values
[2020-08-25T20:44:40,149][WARN ][org.logstash.instrument.metrics.gauge.LazyDelegatingGauge][main] A gauge metric of an unknown type (org.jruby.RubyArray) has been created for key: cluster_uuids. This may result in invalid serialization.  It is recommended to log an issue to the responsible developer/development team.
[2020-08-25T20:44:40,189][INFO ][logstash.javapipeline    ][main] Starting pipeline {:pipeline_id=>"main", "pipeline.workers"=>4, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>50, "pipeline.max_inflight"=>500, "pipeline.sources"=>["E:/ELK/logstash-7.6.2/config/logstash_file.conf"], :thread=>"#<Thread:0x5f39f2d0 run>"}
[2020-08-25T20:44:43,482][INFO ][logstash.javapipeline    ][main] Pipeline started {"pipeline.id"=>"main"}
[2020-08-25T20:44:43,692][INFO ][logstash.agent           ] Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}
E:/ELK/logstash-7.6.2/vendor/bundle/jruby/2.5.0/gems/awesome_print-1.7.0/lib/awesome_print/formatters/base_formatter.rb:31: warning: constant ::Fixnum is deprecated
{
        "@timestamp" => 0020-08-13T05:03:26.000Z,
              "time" => "20/08/13-13:09:09",
    "syslog_message" => " com.zd.ICCApplication.logStarting - Starting ICCApplication on MicroWin10-1535 with PID 12724 (E:\\xiangmu\\icc-server\\trunk\\target\\classes started by Administrator in E:\\xiangmu\\icc-server\\trunk)\r",
          "@version" => "1",
           "message" => "2020/08/13-13:09:09 [main] INFO  com.zd.ICCApplication.logStarting - Starting ICCApplication on MicroWin10-1535 with PID 12724 (E:\\xiangmu\\icc-server\\trunk\\target\\classes started by Administrator in E:\\xiangmu\\icc-server\\trunk)\r",
        "threadName" => "main",
              "host" => "root",
          "logLevel" => "INFO"
}
{
        "@timestamp" => 0020-08-13T05:03:26.000Z,
              "time" => "20/08/13-13:09:09",
    "syslog_message" => "com.zd.ICCApplication.logStarting - Running with Spring Boot v2.3.1.RELEASE, Spring v5.2.7.RELEASE\r",
          "@version" => "1",
           "message" => "2020/08/13-13:09:09 [main] DEBUG com.zd.ICCApplication.logStarting - Running with Spring Boot v2.3.1.RELEASE, Spring v5.2.7.RELEASE\r",
        "threadName" => "main",
              "host" => "root",
          "logLevel" => "DEBUG"
}
[2020-08-25T20:44:45,492][INFO ][logstash.agent           ] Successfully started Logstash API endpoint {:port=>9600}
[2020-08-25T20:44:45,902][INFO ][logstash.runner          ] Logstash shut down.
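
The @timestamp of year 0020 exposes a grok subtlety: %{DATESTAMP} cannot match a four-digit year at the start of "2020/08/13", and since grok patterns are not anchored it silently skips the leading "20", capturing time as "20/08/13-13:09:09". The date filter then parses "20" as the year 20 AD (the odd 05:03:26 comes from applying historical timezone rules to that year). A sketch of a stricter pattern that spells the date out explicitly and avoids the problem:

filter {
  grok {
    match => {
      "message" => '(?<time>%{YEAR}/%{MONTHNUM}/%{MONTHDAY}-%{TIME}) \[%{WORD:threadName}\] %{WORD:logLevel} %{GREEDYDATA:syslog_message}'
    }
  }

  date {
    match => [ "time", "YYYY/MM/dd-HH:mm:ss" ]
    locale => en
  }
}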

3. Modify logstash_file.conf to read the log file and index it into Elasticsearch

input {
    file {
        path => "G:/logs/test.log"
        type => "testfile"
        start_position => "beginning"
    }
}

filter {
    grok {
        match => {
            "message" => '%{DATESTAMP:time} \[%{WORD:threadName}\] %{WORD:logLevel} %{GREEDYDATA:syslog_message}'
        }
    }

    date {
        match => ["time", "YYYY/MM/dd-HH:mm:ss"]
        locale => en
    }
}

output {
    stdout {
        codec => rubydebug
    }
    elasticsearch {
        hosts => ["127.0.0.1:9200", "127.0.0.1:19200"] 
        index => "testfile-%{+YYYY.MM.dd}"
        template_overwrite => true
    }
}
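
One caveat with the file input: Logstash records how far it has read each file in a sincedb, so start_position => "beginning" only takes effect the first time a file is seen. While testing, a common trick is to discard the recorded position so the file is re-read on every run (sketch; "NUL" is the Windows counterpart of /dev/null):

input {
    file {
        path => "G:/logs/test.log"
        type => "testfile"
        start_position => "beginning"
        # discard the read position so the file is re-read on every run (testing only)
        sincedb_path => "NUL"
    }
}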

Run it:

liqiang@root MINGW64 /e/ELK/logstash-7.6.2
$ ./bin/logstash -f ./config/logstash_file.conf

4. The index field mapping as shown in Kibana:

{
  "mapping": {
    "_doc": {
      "properties": {
        "@timestamp": {
          "type": "date"
        },
        "@version": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        },
        "host": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        },
        "logLevel": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        },
        "message": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        },
        "path": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        },
        "syslog_message": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        },
        "threadName": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        },
        "time": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        },
        "type": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        }
      }
    }
  }
}

Summary:

  Note in the mapping above that every grok-extracted field (time, logLevel, threadName, syslog_message, ...) was dynamically mapped as text with a .keyword sub-field, so aggregations in Kibana should use the .keyword variants (or define an explicit index template). For commonly used grok patterns, see the Alibaba (阿里) grok pattern reference.