39.0 Introduction to the Elastic Stack
Review: (redis)
Two foundational theories of distributed systems: CAP / BASE
CAP: AP
C, A, P: a system can guarantee at most two of the three;
AP: availability, partition tolerance, weak consistency;
BASE: BA, S, E
BA: basically available; S: soft state; E: eventual consistency;
Distributed systems:
Distributed storage:
NoSQL:
kv, document, column families, GraphDB
Distributed file systems: expose a filesystem interface
Distributed storage: API access only, cannot be mounted; ceph, glusterfs, HDFS…
NewSQL:
PingCAP: TiDB (MySQL protocol)…
Distributed computing: mapreduce, …
HADOOP = MapReduce + HDFS; HBase (Hadoop Database)
redis: REmote DIctionary Server
Data structures: String, List, Set, sorted_set, Hash, pubsub …
MySQL (MyISAM), Redis (k/v)
MyISAM: supports full-text (Fulltext) indexes
B+Tree: leftmost-prefix indexing
The two big problems an Internet search engine faces:
data processing
data storage
Google's engine:
gfs: a storage system for massive data
mapreduce: a distributed application-processing framework, built on the concepts "Map" and "Reduce"
Hadoop
an open-source clone of the data-processing and data-storage pieces described in Google's published search-engine papers
mapreduce: distributed application-processing framework
HDFS: file storage system
Inverted index
look up documents by keyword
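The idea can be sketched in a few lines of shell: build a toy inverted index mapping each word to the documents that contain it (the document names and contents below are made up for illustration):

```shell
# toy inverted index: word -> space-separated list of doc ids
# input lines look like: "<doc-id> <word> <word> ..."
printf 'doc1 the quick fox\ndoc2 the lazy dog\n' |
awk '{ for (i = 2; i <= NF; i++) idx[$i] = idx[$i] " " $1 }
     END { for (w in idx) print w ":" idx[w] }' | sort
```

Looking up `the` now returns `doc1 doc2` directly: keyword → documents, the reverse of scanning every document for a keyword.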
Lucene: a search-engine library written in Java
https://baike.baidu.com/item/Lucene/6753302?fr=aladdin
an ASF project
ETL: extract, transform, load
shard: a slice of an index
Document: analogous to a row in a relational database, e.g. {title:BODY,ident:ID,user:USER}
type: analogous to a table in a relational database
index: analogous to a database in a relational database
Lucene builds indexes and provides search functionality, but it has no search interface of its own and cannot interact with users directly.
You can build a shell around Lucene that talks to it over a socket: the shell converts what the user wants to query into Lucene's query format and hands the query to Lucene for processing.
Sphinx: a search-engine library written in C++
(named after the Egyptian sphinx)
Solr: a search-engine server
an ASF project; it is only the search server and brings no data source of its own
Data source: a web crawler can walk the web, convert the pages it fetches into the required format, and load them into the engine's storage, which then offers an API for users to search and query. For site-internal search, the data can be pulled from the site's own database.
Originally designed as a single-node system; it can be seen as a shell around Lucene.
elastic search
a system for log storage, analysis, and retrieval
runs distributed across many nodes; can be seen as a shell around Lucene, whose core functionality is Lucene
supports unicast node discovery; multicast is no longer supported
Shard: a partition of an index
The data Lucene indexes is stored as JSON documents and split into shards.
Elasticsearch's sharding works like Redis Cluster's: two-level routing for distribution. Instead of hashing keys modulo the node count, the hash is taken modulo a fixed number (Redis Cluster uses 16384 slots; Elasticsearch hashes the routing value modulo the fixed number of primary shards). This pattern is called two-level distribution or two-level routing: either a central node records which cluster node holds each key, so the value can be fetched from there, or there is no dedicated center and every node can route requests. Each shard on each node can serve as an independent (non-relational) index.
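The routing idea — hash against a fixed divisor instead of the node count — can be sketched in shell. This is illustrative only: Redis Cluster actually uses CRC16 and Elasticsearch uses murmur3 on the routing value; plain `cksum` stands in for those hashes here:

```shell
# route keys to one of 16384 fixed slots; the slot->node assignment can
# then change without rehashing every key (unlike "hash % node-count")
for key in user:1001 user:1002 order:77; do
  hash=$(printf '%s' "$key" | cksum | cut -d' ' -f1)
  echo "$key -> slot $(( hash % 16384 ))"
done
```

Because the divisor never changes, adding or removing a node only moves slot assignments, not the key-to-slot mapping.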
Logstash
: a process that runs on the log-producing servers, grabs the logs, and ships them to elastic search. Heavyweight and memory-hungry: it needs at least a few hundred MB of RAM. An ETL tool, RESTful in style.
filebeat
is another log-extraction tool: efficient, using only a few MB of memory. filebeat grabs the logs; logstash does the transformation.
kibana
: a visualization front end for elastic search, similar to grafana
ELK plus filebeat (ELFK) is what is called the Elastic Stack
(Figures 1 and 2: deployment diagrams)
(Figure 3: ES architecture)
https://www.elastic.co/downloads/past-releases
[[email protected] ~ ]#vim /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.31.7 node01
192.168.31.17 node02
192.168.31.27 node03
192.168.31.37 node04
Do the same on the other nodes.
[[email protected] ~ ]#lftp 172.18.0.1
lftp 172.18.0.1:~> cd pub/Sources/7.x86_64/elasticstack/
[[email protected] ~ ]#yum install java-1.8.0-openjdk-devel -y
[[email protected] ~ ]#rpm -ivh elasticsearch-5.6.8.rpm
[[email protected] ~ ]#vim /etc/elasticsearch/elasticsearch.yml
cluster.name: myels
node.name: node01
path.data: /els/data
path.logs: /els/logs
network.host: 192.168.31.7
http.port: 9200
discovery.zen.ping.unicast.hosts: ["node01", "node02","node03"]
discovery.zen.minimum_master_nodes: 1 # note: with three master-eligible nodes the recommended quorum is (3/2)+1 = 2, to avoid split-brain
[[email protected] ~ ]#vim /etc/elasticsearch/jvm.options
# Xmx represents the maximum size of total heap space
-Xms1g # elasticsearch requires Xms and Xmx to be equal
-Xmx1g
[[email protected] ~ ]#rpm -ivh elasticsearch-5.6.8.rpm
[[email protected] ~ ]#vim /etc/elasticsearch/jvm.options
[[email protected] ~ ]#rpm -ivh elasticsearch-5.6.8.rpm
[[email protected] ~ ]#vim /etc/elasticsearch/jvm.options
[[email protected] ~ ]#mkdir -p /els/{data,logs} && chown elasticsearch.elasticsearch /els/*
[[email protected] ~ ]#mkdir -p /els/{data,logs} && chown elasticsearch.elasticsearch /els/*
[[email protected] ~ ]#mkdir -p /els/{data,logs} && chown elasticsearch.elasticsearch /els/*
[[email protected] ~ ]#systemctl start elasticsearch.service
[[email protected] ~ ]#systemctl start elasticsearch.service
[[email protected] ~ ]#systemctl start elasticsearch.service
[[email protected] ~ ]#ss -ntl
LISTEN 0 128 ::ffff:192.168.31.7:9200 #client-facing service port
LISTEN 0 128 ::ffff:192.168.31.7:9300 #inter-node coordination (transport) port
Startup is slow.
filebeat: collects logs
[[email protected] ~ ]#curl http://node01:9200/
{
  "name" : "node01",
  "cluster_name" : "myels",
  "cluster_uuid" : "GhLswAecRuyP55L-6a6lZw",
  "version" : {
    "number" : "5.6.8",
    "build_hash" : "688ecce",
    "build_date" : "2018-02-16T16:46:30.010Z",
    "build_snapshot" : false,
    "lucene_version" : "6.6.1"
  },
  "tagline" : "You Know, for Search"
}
[[email protected] ~ ]#curl http://node01:9200/_cat/nodes
192.168.31.7 19 92 0 0.00 0.01 0.05 mdi * node01
192.168.31.17 7 95 0 0.00 0.01 0.10 mdi - node02
192.168.31.27 6 95 0 0.05 0.03 0.06 mdi - node03
[[email protected] ~ ]#curl http://node01:9200/_cat/health
1537516549 15:55:49 myels green 3 3 10 5 0 0 0 0 - 100.0%
[[email protected] ~ ]#curl http://node01:9200/_cat
[[email protected] ~ ]#curl http://node01:9200/myindex/students/1?pretty=true
The default method is GET; the result is an error. ?pretty=true formats the output for readability.
{
"error" : {
"root_cause" : [
{
"type" : "index_not_found_exception",
"reason" : "no such index",
"resource.type" : "index_expression",
"resource.id" : "myindex",
"index_uuid" : "_na_",
"index" : "myindex"
}
],
"type" : "index_not_found_exception",
"reason" : "no such index",
"resource.type" : "index_expression",
"resource.id" : "myindex",
"index_uuid" : "_na_",
"index" : "myindex"
},
"status" : 404
}
[[email protected] ~ ]#curl -XPUT http://node01:9200/myindex/students/1 -d '{"name":"dhy","age":18,"song":"rose"}'
{"_index":"myindex","_type":"students","_id":"1","_version":1,"result":"created","_shards":{"total":2,"successful":2,"failed":0},"created":true}
#_shards total 2: the write went to the primary and one replica
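The create response can be checked from a script with jq (the response body is inlined here from the call above):

```shell
# confirm the write reached both shard copies (primary + replica)
resp='{"_index":"myindex","_type":"students","_id":"1","_version":1,"result":"created","_shards":{"total":2,"successful":2,"failed":0},"created":true}'
echo "$resp" | jq -r '.result, ._shards.successful'
```

This prints `created` and `2`; a `successful` count below `total` would mean a replica missed the write.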
[[email protected] ~ ]#curl http://node01:9200/myindex/students/1?pretty
{
"_index" : "myindex",
"_type" : "students",
"_id" : "1",
"_version" : 1,
"found" : true,
"_source" : {
"name" : "dhy",
"age" : 18,
"song" : "rose"
}
}
[[email protected] ~ ]#curl -XGET http://node01:9200/_cat/indices
green open myindex Da2GRrF0RI2o6x5MLNyiyQ 5 1 1 0 9.5kb 4.7kb
[[email protected] ~ ]#curl -XGET http://node01:9200/_cat/shards
myindex 4 p STARTED 0 162b 192.168.31.17 node02
myindex 4 r STARTED 0 162b 192.168.31.7 node01
myindex 3 r STARTED 1 4.2kb 192.168.31.17 node02
myindex 3 p STARTED 1 4.2kb 192.168.31.7 node01
myindex 1 r STARTED 0 162b 192.168.31.27 node03
myindex 1 p STARTED 0 162b 192.168.31.7 node01
myindex 2 p STARTED 0 162b 192.168.31.17 node02
myindex 2 r STARTED 0 162b 192.168.31.27 node03
myindex 0 p STARTED 0 162b 192.168.31.17 node02
myindex 0 r STARTED 0 162b 192.168.31.27 node03
#10 shards: 5 primaries and 5 replicas; a primary and its replica never sit on the same node; the data is sharded
[[email protected] ~ ]#yum install jq -y
#a command-line JSON processor
[[email protected] ~ ]#curl -s -XGET http://node01:9200/_search?q=song:rose |jq .
{
"took": 9,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 0.2876821,
"hits": [
{
"_index": "myindex",
"_type": "students",
"_id": "1",
"_score": 0.2876821,
"_source": {
"name": "dhy",
"age": 18,
"song": "rose"
}
}
]
}
}
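The same jq tool can pull out only the matching documents; with a response like the one above saved (or inlined, as here), `.hits.hits[]._source` strips away the search metadata:

```shell
# extract only the stored documents from a search response
resp='{"took":9,"hits":{"total":1,"hits":[{"_id":"1","_score":0.2876821,"_source":{"name":"dhy","age":18,"song":"rose"}}]}}'
echo "$resp" | jq -c '.hits.hits[]._source'
```

With more than one hit this prints one compact JSON document per line, which is convenient for piping into further processing.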
https://www.elastic.co/guide/en/elasticsearch/reference/5.6/index.html
Using ETL tools
Extract data, transform it, and load it into elastic search
The Logstash tool (official site)
The Beats tools
On the tomcat servers you can install the low-memory Beats tools, which include filebeat
official site
various plugins
input –> filter –> output
With many tomcat servers all connecting to elastic search directly, the ES servers see too many connections. Put a middleman in between: the beats on the tomcat hosts all send to the middleware, which forwards to elastic search in one stream.
[[email protected] ~ ]#yum install httpd -y
[[email protected] ~ ]#for i in {1..20};do echo "Page $i" > /var/www/html/test$i.html;done
[[email protected] ~ ]#vim /etc/httpd/conf/httpd.conf
LogFormat "%{X-Forwarded}i %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined
[[email protected] ~ ]#systemctl start httpd
Simulate a client with curl -H
[[email protected] ~ ]#curl -H x-forwarded:1.1.1.1 http://192.168.31.47/test1.html
[[email protected] html ]#tail -f /var/log/httpd/access_log
1.1.1.1 - - [21/Sep/2018:16:06:47 +0800] "GET /test1.html HTTP/1.1" 200 7 "-" "curl/7.19.7 (x86_64-redhat-linux-gnu) libcurl/7.19.7 NSS/3.21 Basic ECC zlib/1.2.3 libidn/1.18 libssh2/1.4.2"
[[email protected] ~ ]#while true;do curl -H "X-forwarded:$[$RANDOM%222+1].$[$RANDOM%225].$[$RANDOM%118].1" http://192.168.31.47/test$[$RANDOM%20+1].html;sleep .5;done
[[email protected] html ]#tail -f /var/log/httpd/access_log
120.120.33.1 - - [21/Sep/2018:16:11:30 +0800] "GET /test6.html HTTP/1.1" 200 7 "-" "curl/7.19.7 (x86_64-redhat-linux-gnu) libcurl/7.19.7 NSS/3.21 Basic ECC zlib/1.2.3 libidn/1.18 libssh2/1.4.2"
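The random-IP loop above leans on bash's `$RANDOM` (a value in 0..32767); the modulo-plus-offset arithmetic keeps each octet in a valid range. A minimal sketch of the same trick:

```shell
# generate one pseudo-client IP the way the loop above does:
# first octet 1..222, middle two bounded, last octet fixed at 1
ip="$(( RANDOM % 222 + 1 )).$(( RANDOM % 225 )).$(( RANDOM % 118 )).1"
echo "$ip"
```

Varying the client IP like this gives the geoip filter (used later) something realistic to resolve.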
[[email protected] ~ ]#yum install java-1.8.0-openjdk -y
[[email protected] ~ ]#rpm -ivh logstash-5.6.8.rpm
[[email protected] ~ ]#rpm -ql logstash | wc -l
11293
[[email protected] ~ ]#ls /etc/logstash/
conf.d/ jvm.options log4j2.properties logstash.yml startup.options
[[email protected] ~ ]#rpm -ql logstash |grep logstash$
/usr/share/logstash/bin/logstash
/var/lib/logstash
/var/log/logstash
[[email protected] ~ ]#vim /etc/profile.d/logstash.sh
export PATH=/usr/share/logstash/bin/:$PATH
[[email protected] ~ ]#exec bash
[[email protected] ~ ]#logstash -h
slow to start
[[email protected] ~ ]#cd /etc/logstash/conf.d
[[email protected] conf.d ]#vim stdin-out.conf
input {
stdin{}
}
output {
stdout{
codec => rubydebug # remove this line for the plain default output; rubydebug pretty-prints the decoded event
}
}
logstash documentation
https://www.elastic.co/guide/en/logstash/5.6/index.html
[[email protected] conf.d ]#logstash -f stdin-out.conf -t
#syntax check
[[email protected] conf.d ]#logstash -f stdin-out.conf
WARNING: Could not find logstash.yml which is typically located in $LS_HOME/config or /etc/logstash. You can specify the path using --path.settings. Continuing using the defaults
Could not find log4j2 configuration at path /usr/share/logstash/config/log4j2.properties. Using default config which logs errors to the console
The stdin plugin is now waiting for input:
dhy
2018-09-21T08:27:08.200Z cos47.localdomain dhy
elasticsearch only accepts documents in JSON format, as below
[[email protected] conf.d ]#logstash -f stdin-out.conf
WARNING: Could not find logstash.yml which is typically located in $LS_HOME/config or /etc/logstash. You can specify the path using --path.settings. Continuing using the defaults
Could not find log4j2 configuration at path /usr/share/logstash/config/log4j2.properties. Using default config which logs errors to the console
The stdin plugin is now waiting for input:
dhy hello
{
"@version" => "1",
"host" => "cos47.localdomain",
"@timestamp" => 2018-09-21T08:33:06.781Z,
"message" => "dhy hello"
}
https://www.elastic.co/guide/en/logstash/5.6/plugins-inputs-file.html
Have logstash read its input from a file
[[email protected] conf.d ]#vim file-out.conf
input{
file{
path => ["/var/log/httpd/access_log"]
start_position => "beginning"
}
}
output {
stdout {
codec => rubydebug
}
}
[[email protected] conf.d ]#logstash -f file-out.conf -t
#syntax check
[[email protected] conf.d ]#logstash -f file-out.conf
The output here is too messy; logstash filters can match and transform it.
Official docs: https://www.elastic.co/guide/en/logstash/5.6/plugins-filters-grok.html
Patterns also ship with logstash, so you don't have to write them yourself; for reference:
[[email protected] conf.d ]#rpm -ql logstash | grep patterns
[[email protected] conf.d ]#less /usr/share/logstash/vendor/bundle/jruby/1.9/gems/logstash-patterns-core-4.1.2/patterns/grok-patterns
[[email protected] conf.d ]#cat /usr/share/logstash/vendor/bundle/jruby/1.9/gems/logstash-patterns-core-4.1.2/patterns/httpd
HTTPDUSER %{EMAILADDRESS}|%{USER}
HTTPDERROR_DATE %{DAY} %{MONTH} %{MONTHDAY} %{TIME} %{YEAR}
# Log formats
HTTPD_COMMONLOG %{IPORHOST:clientip} %{HTTPDUSER:ident} %{HTTPDUSER:auth} \[%{HTTPDATE:timestamp}\] "(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})" %{NUMBER:response} (?:%{NUMBER:bytes}|-)
HTTPD_COMBINEDLOG %{HTTPD_COMMONLOG} %{QS:referrer} %{QS:agent}
# Error logs
HTTPD20_ERRORLOG \[%{HTTPDERROR_DATE:timestamp}\] \[%{LOGLEVEL:loglevel}\] (?:\[client %{IPORHOST:clientip}\] ){0,1}%{GREEDYDATA:message}
HTTPD24_ERRORLOG \[%{HTTPDERROR_DATE:timestamp}\] \[%{WORD:module}:%{LOGLEVEL:loglevel}\] \[pid %{POSINT:pid}(:tid %{NUMBER:tid})?\]( \(%{POSINT:proxy_errorcode}\)%{DATA:proxy_message}:)?( \[client %{IPORHOST:clientip}:%{POSINT:clientport}\])?( %{DATA:errorcode}:)? %{GREEDYDATA:message}
HTTPD_ERRORLOG %{HTTPD20_ERRORLOG}|%{HTTPD24_ERRORLOG}
# Deprecated
COMMONAPACHELOG %{HTTPD_COMMONLOG}
COMBINEDAPACHELOG %{HTTPD_COMBINEDLOG}
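What HTTPD_COMBINEDLOG extracts can be approximated in plain shell. Splitting on double quotes with awk isolates the request line and user agent (a rough sketch, not a grok replacement):

```shell
# crude combined-log field extraction: split the line on '"' so that
# field $2 = request line, $4 = referer, $6 = user agent
line='1.1.1.1 - - [21/Sep/2018:16:06:47 +0800] "GET /test1.html HTTP/1.1" 200 7 "-" "curl/7.19.7"'
echo "$line" | awk -F'"' '{ split($1, a, " "); split($3, s, " ");
  print "clientip=" a[1], "request=" $2, "response=" s[1], "agent=" $6 }'
```

grok does the same kind of decomposition, but names every captured field (clientip, verb, response, …) so downstream filters and elasticsearch can address them.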
https://www.elastic.co/guide/en/logstash/5.6/plugins-filters-grok.html#plugins-filters-grok-match
[[email protected] conf.d ]#vim file-out.conf
input{
file{
path => ["/var/log/httpd/access_log"]
start_position => "beginning"
}
}
filter {
grok {
match => { "message" => "%{HTTPD_COMBINEDLOG}" }
}
}
output {
stdout {
codec => rubydebug
}
}
[[email protected] conf.d ]#logstash -f file-out.conf
{
"request" => "/test7.html",
"agent" => "\"curl/7.19.7 (x86_64-redhat-linux-gnu) libcurl/7.19.7 NSS/3.21 Basic ECC zlib/1.2.3 libidn/1.18 libssh2/1.4.2\"",
"auth" => "-",
"ident" => "-",
"verb" => "GET",
"message" => "114.192.25.1 - - [21/Sep/2018:16:57:30 +0800] \"GET /test7.html HTTP/1.1\" 200 7 \"-\" \"curl/7.19.7 (x86_64-redhat-linux-gnu) libcurl/7.19.7 NSS/3.21 Basic ECC zlib/1.2.3 libidn/1.18 libssh2/1.4.2\"",
"path" => "/var/log/httpd/access_log",
"referrer" => "\"-\"",
"@timestamp" => 2018-09-21T08:57:31.304Z,
"response" => "200",
"bytes" => "7",
"clientip" => "114.192.25.1",
"@version" => "1",
"host" => "cos47.localdomain",
"httpversion" => "1.1",
"timestamp" => "21/Sep/2018:16:57:30 +0800"
}
[[email protected] conf.d ]#vim file-out.conf
input{
file{
path => ["/var/log/httpd/access_log"]
start_position => "beginning"
}
}
filter {
grok {
match => { "message" => "%{HTTPD_COMBINEDLOG}" }
remove_field => "message"
}
date {
match => ["timestamp","dd/MMM/YYYY:H:m:s Z"]
remove_field => "timestamp"
}
}
output {
stdout {
codec => rubydebug
}
}
timestamp is removed, and @timestamp (logstash's ingest time) is replaced by the time parsed from timestamp (when the log line was generated)
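The conversion the date filter performs — `dd/MMM/yyyy:HH:mm:ss Z` into UTC ISO8601 — can be reproduced with GNU date (assumed available; the field shuffling just turns the httpd form into something `date -d` accepts):

```shell
# httpd log time -> the ISO8601 form elasticsearch stores in @timestamp
ts='21/Sep/2018:16:57:30 +0800'
d=$(printf '%s' "${ts%%:*}" | tr '/' ' ')   # date part: 21 Sep 2018
t=${ts#*:}                                  # time part: 16:57:30 +0800
date -u -d "$d $t" +%Y-%m-%dT%H:%M:%SZ      # -> 2018-09-21T08:57:30Z
```

Note the timezone is honoured: 16:57:30 +0800 becomes 08:57:30 UTC, matching the @timestamp values in the rubydebug output.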
[[email protected] conf.d ]#logstash -f file-out.conf
{
"request" => "/test11.html",
"agent" => "\"curl/7.19.7 (x86_64-redhat-linux-gnu) libcurl/7.19.7 NSS/3.21 Basic ECC zlib/1.2.3 libidn/1.18 libssh2/1.4.2\"",
"auth" => "-",
"ident" => "-",
"verb" => "GET",
"path" => "/var/log/httpd/access_log",
"referrer" => "\"-\"",
"@timestamp" => 2018-09-21T09:10:52.000Z,
"response" => "200",
"bytes" => "8",
"clientip" => "154.121.23.1",
"@version" => "1",
"host" => "cos47.localdomain",
"httpversion" => "1.1"
}
The geoip filter plugin maps client IPs to geographic locations for distribution analysis.
It looks each IP up in a database of IP-to-location mappings (place and latitude/longitude): the maxmind database.
https://www.elastic.co/guide/en/logstash/5.6/plugins-filters-geoip.html
The GeoIP filter adds information about the geographical location of IP addresses, based on data from the Maxmind GeoLite2 databases.
https://dev.maxmind.com/geoip/geoip2/geolite2/ — download the maxmind database
[[email protected] conf.d ]#wget http://geolite.maxmind.com/download/geoip/database/GeoLite2-City.tar.gz
The database is updated weekly (purchased or expired IP blocks get reassigned), so create a symlink and refresh the database on a schedule.
[[email protected] conf.d ]#mkdir /etc/logstash/maxmind/
[[email protected] conf.d ]#mv GeoLite2-City.tar.gz /etc/logstash/maxmind/
[[email protected] conf.d ]#cd /etc/logstash/maxmind/
[[email protected] maxmind ]#tar xf GeoLite2-City.tar.gz
[[email protected] maxmind ]#ls
GeoLite2-City_20180911 GeoLite2-City.tar.gz
[[email protected] maxmind ]#cd GeoLite2-City_20180911/
[[email protected] GeoLite2-City_20180911 ]#ls
COPYRIGHT.txt GeoLite2-City.mmdb LICENSE.txt README.txt
[[email protected] maxmind ]#ln -sv GeoLite2-City_20180911/GeoLite2-City.mmdb ./
‘./GeoLite2-City.mmdb’ -> ‘GeoLite2-City_20180911/GeoLite2-City.mmdb’
[[email protected] maxmind ]#ll
total 26520
drwxr-xr-x 2 2000 2000 90 Sep 12 05:17 GeoLite2-City_20180911
lrwxrwxrwx 1 root root 41 Sep 21 17:35 GeoLite2-City.mmdb -> GeoLite2-City_20180911/GeoLite2-City.mmdb
-rw-r--r-- 1 root root 27154441 Sep 12 05:17 GeoLite2-City.tar.gz
[[email protected] maxmind ]#cd /etc/logstash/conf.d
[[email protected] conf.d ]#logstash -f file-out.conf -t
[[email protected] conf.d ]#logstash -f file-out.conf
{
"request" => "/test3.html",
"agent" => "\"curl/7.19.7 (x86_64-redhat-linux-gnu) libcurl/7.19.7 NSS/3.21 Basic ECC zlib/1.2.3 libidn/1.18 libssh2/1.4.2\"",
"geoip" => {
"ip" => "134.182.57.1",
"latitude" => 37.751,
"country_name" => "United States",
"country_code2" => "US",
"continent_code" => "NA",
"country_code3" => "US",
"location" => {
"lon" => -97.822,
"lat" => 37.751
},
"longitude" => -97.822
},
"auth" => "-",
"ident" => "-",
"verb" => "GET",
"path" => "/var/log/httpd/access_log",
"referrer" => "\"-\"",
"@timestamp" => 2018-09-21T09:39:40.000Z,
"response" => "200",
"bytes" => "7",
"clientip" => "134.182.57.1",
"@version" => "1",
"host" => "cos47.localdomain",
"httpversion" => "1.1"
}
[[email protected] conf.d ]#cat file-out.conf
input{
file{
path => ["/var/log/httpd/access_log"]
start_position => "beginning"
}
}
filter {
grok {
match => { "message" => "%{HTTPD_COMBINEDLOG}" }
remove_field => "message"
}
date {
match => ["timestamp","dd/MMM/YYYY:H:m:s Z"]
remove_field => "timestamp"
}
geoip {
source => "clientip"
target => "geoip"
database => "/etc/logstash/maxmind/GeoLite2-City.mmdb"
}
}
output {
stdout {
codec => rubydebug
}
}
Splitting fields
https://www.elastic.co/guide/en/logstash/5.6/plugins-filters-mutate.html
The mutate filter allows you to perform general mutations on fields. You can rename, remove, replace, and modify fields in your events.
Output plugins
https://www.elastic.co/guide/en/logstash/5.6/index.html
Output plugins
Elasticsearch output plugin
File output plugin
Kafka output plugin
Redis output plugin
...
https://www.elastic.co/guide/en/logstash/5.6/plugins-outputs-elasticsearch.html
[[email protected] conf.d ]#vim /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.31.7 node01
192.168.31.17 node02
192.168.31.27 node03
#change the output plugin as follows
[[email protected] conf.d ]#vim file-out.conf
output {
elasticsearch {
hosts => ["http://node01:9200/","http://node02:9200/","http://node03:9200/"]
index => "logstash-%{+YYYY.MM.dd}"
document_type => "httpd_access_logs"
}
}
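The `%{+YYYY.MM.dd}` in the index name is a sprintf expansion of each event's `@timestamp`, so logstash rolls over to a fresh index every day. The shell equivalent of today's index name:

```shell
# the index logstash writes to today, e.g. logstash-2018.09.21
echo "logstash-$(date -u +%Y.%m.%d)"
```

Daily indices make retention simple: expiring old data means dropping whole indices rather than deleting individual documents.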
[[email protected] ~ ]#curl http://node01:9200/_cat/indices
green open logstash-2018.09.21 n91hh_W9REiF923QYVh7Zw 5 1 1886 0 2.9mb 1.6mb
green open myindex Da2GRrF0RI2o6x5MLNyiyQ 5 1 1 0 9.7kb 4.8kb
[[email protected] html ]#tail -f /var/log/httpd/access_log
27.118.110.1 - - [21/Sep/2018:17:59:33 +0800] "GET /test8.html HTTP/1.1" 200 7 "-" "curl/7.19.7 (x86_64-redhat-linux-gnu) libcurl/7.19.7 NSS/3.21 Basic ECC zlib/1.2.3 libidn/1.18 libssh2/1.4.2"
[[email protected] ~ ]#curl http://node01:9200/logstash-*/_search?q=clientip:149.150.15.1 |jq .
{
"took": 7,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 6.0606794,
"hits": [
{
"_index": "logstash-2018.09.21",
"_type": "httpd_access_logs",
"_id": "AWX7lMjHyF1gIEU9hupG",
"_score": 6.0606794,
"_source": {
"request": "/test2.html",
"agent": "\"curl/7.19.7 (x86_64-redhat-linux-gnu) libcurl/7.19.7 NSS/3.21 Basic ECC zlib/1.2.3 libidn/1.18 libssh2/1.4.2\"",
"geoip": {
"timezone": "America/New_York",
"ip": "149.150.15.1",
"latitude": 40.749,
"continent_code": "NA",
"city_name": "South Orange",
"country_name": "United States",
"country_code2": "US",
"dma_code": 501,
"country_code3": "US",
"region_name": "New Jersey",
"location": {
"lon": -74.2639,
"lat": 40.749
},
"postal_code": "07079",
"region_code": "NJ",
"longitude": -74.2639
},
"auth": "-",
"ident": "-",
"verb": "GET",
"path": "/var/log/httpd/access_log",
"referrer": "\"-\"",
"@timestamp": "2018-09-21T10:02:35.000Z",
"response": "200",
"bytes": "7",
"clientip": "149.150.15.1",
"@version": "1",
"host": "cos47.localdomain",
"httpversion": "1.1"
}
}
]
}
}
Input from filebeat
A recap of the setup so far:
node01, node02, and node03 run the elastic search service (the elasticsearch demos above).
cos47 runs the httpd service and the logstash service (logstash is written in Java and needs a JDK); it demonstrated logstash as an ETL tool: extract (the input {} block), transform (filter {}), and load (output {}).
Above, logstash read cos47's httpd log file directly as its input. In practice, filebeat from the Beats family should extract the web server's logs and feed them to logstash. Here filebeat is also installed on cos47 (for this lab only): logstash takes its input data from filebeat and performs the format conversion.
[[email protected] ~ ]#rpm -ivh filebeat-5.6.8-x86_64.rpm
[[email protected] ~ ]#rpm -ql filebeat
/etc/filebeat/filebeat.full.yml
/etc/filebeat/filebeat.template-es2x.json
/etc/filebeat/filebeat.template-es6x.json
/etc/filebeat/filebeat.template.json
/etc/filebeat/filebeat.yml
/usr/share/filebeat/bin/filebeat
[[email protected] filebeat ]#vim filebeat.yml
- /var/log/httpd/access_log
#-------------------------- Elasticsearch output ------------------------------
#output.elasticsearch:
# Array of hosts to connect to.
#hosts: ["localhost:9200"]
#----------------------------- Logstash output --------------------------------
output.logstash:
# The Logstash hosts
hosts: ["192.168.31.47:5044"]
https://www.elastic.co/guide/en/logstash/current/plugins-inputs-beats.html
[[email protected] conf.d ]#pwd
/etc/logstash/conf.d
Change the input to the filebeat port
[[email protected] conf.d ]#vim file-out.conf
input{
beats {
port => 5044
}
}
filter {
grok {
match => { "message" => "%{HTTPD_COMBINEDLOG}" }
remove_field => "message"
}
date {
match => ["timestamp","dd/MMM/YYYY:H:m:s Z"]
remove_field => "timestamp"
}
geoip {
source => "clientip"
target => "geoip"
database => "/etc/logstash/maxmind/GeoLite2-City.mmdb"
}
}
output {
elasticsearch {
hosts => ["http://node01:9200/","http://node02:9200/","http://node03:9200/"]
index => "logstash-%{+YYYY.MM.dd}"
document_type => "httpd_access_logs"
}
}
[[email protected] conf.d ]#logstash -f file-out.conf
WARNING: Could not find logstash.yml which is typically located in $LS_HOME/config or /etc/logstash. You can specify the path using --path.settings. Continuing using the defaults
Could not find log4j2 configuration at path /usr/share/logstash/config/log4j2.properties. Using default config which logs errors to the console
[INFO ] 2018-09-21 18:35:30.813 [[main]<beats] Server - Starting server on port: 5044
[[email protected] html ]#ss -ntl
#listening on 5044
LISTEN 0 128 :::5044
[[email protected] ~ ]#systemctl start filebeat
[[email protected] ~ ]#ps aux
root 9306 0.5 0.9 23904 15172 ? Ssl 18:45 0:00 /usr/share/filebeat/bin/filebeat -c /etc/f
[[email protected] ~ ]#curl http://node01:9200/logstash-*/_search?q=response:404 | jq .
****省略****
],
"referrer": "\"-\"",
"@timestamp": "2018-09-21T08:10:10.000Z",
"response": "404",
"bytes": "208",
"clientip": "132.156.12.1",
"@version": "1",
"host": "cos47.localdomain",
"httpversion": "1.1"
}
}
]
}
}
Adding redis
[[email protected] ~ ]#systemctl stop filebeat
[[email protected] ~ ]#yum install redis
[[email protected] ~ ]#vim /etc/redis.conf
bind 0.0.0.0
requirepass dhy.com
[[email protected] ~ ]#systemctl start redis
[[email protected] ~ ]#ss -ntl
#port 6379 is listening
[[email protected] ~ ]#vim /etc/filebeat/filebeat.full.yml
[[email protected] ~ ]#vim /etc/filebeat/filebeat.yml
#------------------------------- Redis output ----------------------------------
output.redis:
enabled: true
hosts: ["192.168.31.47:6379"]
port: 6379
key: filebeat
db: 0
password: dhy.com
datatype: list
Disable the Logstash output section in filebeat.yml
[[email protected] ~ ]#systemctl restart filebeat
[[email protected] filebeat ]#tail /var/log/filebeat/filebeat
2018-09-22T19:41:09+08:00 INFO Loading registrar data from /var/lib/filebeat/registry
2018-09-22T19:41:09+08:00 INFO States Loaded from registrar: 1
2018-09-22T19:41:09+08:00 INFO Loading Prospectors: 1
2018-09-22T19:41:09+08:00 INFO Starting Registrar
2018-09-22T19:41:09+08:00 INFO Start sending events to output
2018-09-22T19:41:09+08:00 INFO Prospector with previous states loaded: 1
2018-09-22T19:41:09+08:00 INFO Starting prospector of type: log; id: 14835892602573179500
2018-09-22T19:41:09+08:00 INFO Loading and starting Prospectors completed. Enabled prospectors: 1
2018-09-22T19:41:09+08:00 INFO Starting spooler: spool_size: 2048; idle_timeout: 5s
2018-09-22T19:41:09+08:00 INFO Harvester started for file: /var/log/httpd/access_log
[[email protected] filebeat ]#redis-cli -a dhy.com
127.0.0.1:6379> KEYS *
1) "filebeat"
127.0.0.1:6379> LINDEX filebeat 0
"{\"@timestamp\":\"2018-09-22T11:41:09.956Z\",\"beat\":{\"hostname\":\"cos47.localdomain\",\"name\":\"cos47.localdomain\",\"version\":\"5.6.8\"},\"input_type\":\"log\",\"message\":\"48.59.26.1 - - [22/Sep/2018:19:34:14 +0800] \\\"GET /test12.html HTTP/1.1\\\" 200 8 \\\"-\\\" \\\"curl/7.19.7 (x86_64-redhat-linux-gnu) libcurl/7.19.7 NSS/3.21 Basic ECC zlib/1.2.3 libidn/1.18 libssh2/1.4.2\\\"\",\"offset\":3713292,\"source\":\"/var/log/httpd/access_log\",\"type\":\"log\"}"
https://www.elastic.co/guide/en/logstash/5.6/plugins-inputs-redis.html
#change the input plugin to redis
[[email protected] conf.d ]#vim file-out.conf
input{
redis {
host => "192.168.31.47"
port => 6379
password => "dhy.com"
db => 0
key => "filebeat"
data_type => "list"
}
}
[[email protected] conf.d ]#logstash -f file-out.conf -t
[[email protected] filebeat ]#redis-cli -a dhy.com
127.0.0.1:6379> LLEN filebeat
(integer) 1694
127.0.0.1:6379> LLEN filebeat
(integer) 1773
[[email protected] conf.d ]#logstash -f file-out.conf
WARNING: Could not find logstash.yml which is typically located in $LS_HOME/config or /etc/logstash. You can specify the path using --path.settings. Continuing using the defaults
Could not find log4j2 configuration at path /usr/share/logstash/config/log4j2.properties. Using default config which logs errors to the console
#blocks here, waiting for events
[[email protected] filebeat ]#redis-cli -a dhy.com
127.0.0.1:6379> LLEN filebeat
(integer) 0
#the list entries have been read off by logstash
https://www.elastic.co/guide/en/logstash/5.6/event-dependent-configuration.html
if EXPRESSION {
…
} else if EXPRESSION {
…
} else {
…
}
https://discuss.elastic.co/t/how-to-tag-log-files-in-filebeat-for-logstash-ingestion/44713/3
[[email protected] ~ ]#vim /etc/filebeat/filebeat.yml
# Each - is a prospector. Most options can be set at the prospector level
# Paths that should be crawled and fetched. Glob based paths.
  paths:
    - /var/log/httpd/access_log
  fields:
    log_type: access
- paths:
    - /var/log/httpd/error_log
  fields:
    log_type: errors
[[email protected] filebeat ]#redis-cli -a dhy.com
127.0.0.1:6379> LLEN filebeat
(integer) 4313
127.0.0.1:6379> LINDEX filebeat 4313
"{\"@timestamp\":\"2018-09-22T13:21:15.457Z\",\"beat\":{\"hostname\":\"cos47.localdomain\",\"name\":\"cos47.localdomain\",\"version\":\"5.6.8\"},\"fields\":{\"log_type\":\"access\"},\"input_type\":\"log\",\"message\":\"5.99.56.1 - - [22/Sep/2018:21:21:14 +0800] \\\"GET /test13.html HTTP/1.1\\\" 200 8 \\\"-\\\" \\\"curl/7.19.7 (x86_64-redhat-linux-gnu) libcurl/7.19.7 NSS/3.21 Basic ECC zlib/1.2.3 libidn/1.18 libssh2/1.4.2\\\"\",\"offset\":5445561,\"source\":\"/var/log/httpd/access_log\",\"type\":\"log\"}"
#note the new ,"fields":{"log_type":"access"} element in each record
[[email protected] filebeat ]#systemctl restart filebeat
[[email protected] conf.d ]#vim file-out.conf
input{
redis {
host => "192.168.31.47"
port => 6379
password => "dhy.com"
db => 0
key => "filebeat"
data_type => "list"
}
}
filter {
if [fields][log_type] == "access" {
grok {
match => { "message" => "%{HTTPD_COMBINEDLOG}" }
remove_field => ["message","beat"]
}
date {
match => ["timestamp","dd/MMM/YYYY:H:m:s Z"]
remove_field => "timestamp"
}
geoip {
source => "clientip"
target => "geoip"
database => "/etc/logstash/maxmind/GeoLite2-City.mmdb"
}
}
}
output {
if [fields][log_type] == "access" {
elasticsearch {
hosts => ["http://node01:9200/","http://node02:9200/","http://node03:9200/"]
index => "logstash-%{+YYYY.MM.dd}"
document_type => "httpd_access_logs"
}
} else {
elasticsearch {
hosts => ["http://node01:9200/","http://node02:9200/","http://node03:9200/"]
index => "logstash-%{+YYYY.MM.dd}"
document_type => "httpd_error_logs"
}
}
}
[[email protected] conf.d ]#logstash -f file-out.conf -t
[[email protected] conf.d ]#systemctl start logstash
[[email protected] conf.d ]#systemctl status logstash
[[email protected] conf.d ]#ps aux
logstash 3651 31.3 27.4 3757668 419288 ? SNsl 20:32 0:21 /usr/bin/java -XX:+UseParNewGC -XX:+UseCon
kibana
[[email protected] ~ ]#rpm -ivh kibana-5.6.8-x86_64.rpm
[[email protected] ~ ]#vim /etc/kibana/kibana.yml
server.port: 5601
server.host: "0.0.0.0" #listen on all local addresses
server.name: "node04"
elasticsearch.url: "http://192.168.31.17:9200"
elasticsearch.preserveHost: true
kibana.index: ".kibana"
[[email protected] ~ ]#systemctl start kibana
[[email protected] ~ ]#ss -ntl
LISTEN 0 128 *:5601
As shown in the figures.
Filter expressions in the kibana search bar:
response:200 OR response:404
agent:curl
References:
Elasticsearch master-election process https://www.easyice.cn/archives/164
Elasticsearch fundamentals and usage https://www.cnblogs.com/luxiaoxun/p/4869509.html