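Building a Graylog Log-Monitoring Cluster
Our company works in IoT: every device reports GPS data every 10 seconds, and the backend developers log the handling of every uploaded record. They have also written many stateless background services with Quartz that run every 10 seconds, and several environments (development, testing, production) write logs at the same time. The log volume averages about 3,000 messages/s and peaks at about 5,000 messages/s. Even with rotation, three days of logs take up roughly 300 GB, which the original minimal installation could no longer handle.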
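To put those numbers in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes the average rate is sustained around the clock, that "300 GB" is the stored size of three days of messages, and that 1 GB = 10^9 bytes; the function names are made up for this illustration.

```python
SECONDS_PER_DAY = 86_400


def messages_in_window(rate_per_s: float, days: float) -> int:
    """Number of messages produced at a sustained rate over `days` days."""
    return round(rate_per_s * SECONDS_PER_DAY * days)


def avg_bytes_per_message(stored_bytes: float, rate_per_s: float, days: float) -> float:
    """Average stored size of one message, given the total size of the window."""
    return stored_bytes / messages_in_window(rate_per_s, days)


def daily_volume_gb(rate_per_s: float, bytes_per_message: float) -> float:
    """Stored volume per day in GB (10^9 bytes) at a sustained rate."""
    return rate_per_s * SECONDS_PER_DAY * bytes_per_message / 1e9


if __name__ == "__main__":
    avg_rate, peak_rate = 3000, 5000   # messages per second, from the text
    stored = 300e9                     # about 300 GB for three days

    size = avg_bytes_per_message(stored, avg_rate, days=3)
    print(f"messages in 3 days:      {messages_in_window(avg_rate, 3):,}")
    print(f"average bytes/message:   {size:.0f}")
    print(f"GB/day at average rate:  {daily_volume_gb(avg_rate, size):.1f}")
    print(f"GB/day if peak persists: {daily_volume_gb(peak_rate, size):.1f}")
```

The point of the exercise is only to show that the disks have to absorb on the order of 100 GB per day, which is why disk I/O comes up again in the tuning notes.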
For reference, here is the docker-compose.yml of that minimal installation:
version: '2'
services:
  # MongoDB: https://hub.docker.com/_/mongo/
  mongodb:
    image: mongo:3
    volumes:
      - mongo_data:/data/db
  # Elasticsearch: https://www.elastic.co/guide/en/elasticsearch/reference/5.6/docker.html
  elasticsearch:
    # This image is slow to download, but be sure to use it
    image: docker.elastic.co/elasticsearch/elasticsearch:5.6.3
    volumes:
      - es_data:/usr/share/elasticsearch/data
    environment:
      - http.host=0.0.0.0
      - transport.host=localhost
      - network.host=0.0.0.0
      # Disable X-Pack security: https://www.elastic.co/guide/en/elasticsearch/reference/5.6/security-settings.html#general-security-settings
      - xpack.security.enabled=false
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
    ulimits:
      memlock:
        soft: -1
        hard: -1
    mem_limit: 1g
  # Graylog: https://hub.docker.com/r/graylog/graylog/
  graylog:
    image: graylog/graylog:2.4.0-1
    volumes:
      - graylog_journal:/usr/share/graylog/data/journal
    environment:
      # CHANGE ME!
      - GRAYLOG_PASSWORD_SECRET=somepasswordpepper
      # Password: admin. To generate a hash: echo -n yourpassword | shasum -a 256
      - GRAYLOG_ROOT_PASSWORD_SHA2=8c6976e5b5410415bde908bd4dee15dfb167a9c873fc4bb8a81f6f2ab448a918
      - GRAYLOG_WEB_ENDPOINT_URI=http://127.0.0.1:9000/api
    links:
      - mongodb:mongo
      - elasticsearch
    depends_on:
      - mongodb
      - elasticsearch
    ports:
      # Graylog web interface and REST API
      - 9000:9000
      # Syslog TCP
      - 514:514
      # Syslog UDP
      - 514:514/udp
      # GELF TCP
      - 12201:12201
      # GELF UDP
      - 12201:12201/udp
# Volumes for persisting data, see https://docs.docker.com/engine/admin/volumes/volumes/
volumes:
  mongo_data:
    driver: local
  es_data:
    driver: local
  graylog_journal:
    driver: local
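The value of GRAYLOG_ROOT_PASSWORD_SHA2 is simply the SHA-256 hex digest of the admin password. If shasum is not at hand, the same value can be produced with Python's standard library. This is a minimal sketch; `root_password_sha2` and `new_password_secret` are names invented here, and the 96-character length of the generated secret is an arbitrary choice for illustration, not a Graylog requirement.

```python
import hashlib
import secrets


def root_password_sha2(password: str) -> str:
    """SHA-256 hex digest of the password, as expected by GRAYLOG_ROOT_PASSWORD_SHA2."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


def new_password_secret(num_bytes: int = 48) -> str:
    """Random hex string usable as GRAYLOG_PASSWORD_SECRET (two hex chars per byte)."""
    return secrets.token_hex(num_bytes)


if __name__ == "__main__":
    print("GRAYLOG_ROOT_PASSWORD_SHA2=" + root_password_sha2("admin"))
    print("GRAYLOG_PASSWORD_SECRET=" + new_password_secret())
```

Hashing "admin" reproduces the digest used in the compose file above, which is a quick way to confirm that file still carries the default password and should be changed.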
Usage notes
Before using Graylog, be sure to read the architectural considerations in its documentation to understand the design:
- Graylog nodes should have a focus on CPU power. These also serve the user interface to the browser.
- Elasticsearch nodes should have as much RAM as possible and the fastest disks you can get. Everything depends on I/O speed here.
- MongoDB is storing meta information and configuration data and doesn't need many resources.
In short: Graylog collects logs and writes them to Elasticsearch, which is CPU-intensive, while Elasticsearch is heavy on RAM and I/O. Knowing this largely sets the direction for the tuning later on.
Three-node setup steps (for reference)
Preparation (CentOS 7.3)
yum -y update
yum install java-1.8.0-openjdk-headless.x86_64   # install the JDK
Setting up a MongoDB 3.6 replica set
First, install MongoDB on every node.
- Add the MongoDB package repository
# /etc/yum.repos.d/mongodb-org-3.6.repo
[mongodb-org-3.6]
name=MongoDB Repository
baseurl=https://repo.mongodb.org/yum/redhat/$releasever/mongodb-org/3.6/x86_64/
gpgcheck=1
enabled=1
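gpgkey=https://www.mongodb.org/static/pgp/server-3.6.asc
- Refresh the repositories and install
yum update -y && yum install -y mongodb-org
- Key configuration items
# /etc/mongod.conf
# Keep the MongoDB data outside the system directories.
# Fix the directory ownership first: chown -R mongod:mongod /data/mongo
storage:
  dbPath: /data/mongo
# Set the MongoDB replSetName
replication:
  replSetName: repl
# Set bindIp to the LAN IP and make sure that IP is reachable from the other nodes
net:
  port: 27017
  bindIp: 192.168.168.242
- Start the MongoDB service
systemctl start mongod
- Initialize the MongoDB replica set
# start the mongo shell on one node and run
rs.initiate( {
   _id : "repl",   // must match replSetName in mongod.conf
   members: [
      // placeholder hosts: use addresses the other members can reach
      { _id: 0, host: "mongodb0.example.net:27017" },
      { _id: 1, host: "mongodb1.example.net:27017" },
      { _id: 2, host: "mongodb2.example.net:27017" }
   ]
})
- View the replica set configuration
rs.conf()
- Check which node is the replica set Primary
rs.status()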
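The output of rs.status() is long, and the only parts that matter here are the `members` entries with their `name`, `stateStr` and `health` fields. A small Python sketch of how that document could be inspected follows. The sample dictionary is hand-written for illustration, not real output, and `find_primary` and `healthy_member_count` are hypothetical helper names.

```python
from typing import Optional


def find_primary(status: dict) -> Optional[str]:
    """Return the host:port of the member whose stateStr is PRIMARY, if any."""
    for member in status.get("members", []):
        if member.get("stateStr") == "PRIMARY":
            return member.get("name")
    return None


def healthy_member_count(status: dict) -> int:
    """Count members that report health == 1 (reachable)."""
    return sum(1 for m in status.get("members", []) if m.get("health") == 1)


# Hand-written sample shaped like a replica set status document (illustrative only).
SAMPLE_STATUS = {
    "set": "repl",
    "members": [
        {"_id": 0, "name": "192.168.168.240:27017", "health": 1, "stateStr": "SECONDARY"},
        {"_id": 1, "name": "192.168.168.241:27017", "health": 1, "stateStr": "SECONDARY"},
        {"_id": 2, "name": "192.168.168.242:27017", "health": 1, "stateStr": "PRIMARY"},
    ],
}

if __name__ == "__main__":
    print("primary:", find_primary(SAMPLE_STATUS))
    print("healthy members:", healthy_member_count(SAMPLE_STATUS))
```

With the pymongo driver installed, the same kind of document can be fetched from a live node by running the `replSetGetStatus` admin command and passed to these functions unchanged.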
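Setting up the Elasticsearch 5.6 cluster
- Import the Elasticsearch public GPG key and add the repository
rpm --import https://artifacts.elastic.co/GPG-KEY-elasticsearch
# /etc/yum.repos.d/elasticsearch.repo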
[elasticsearch-5.x]
name=Elasticsearch repository for 5.x packages
baseurl=https://artifacts.elastic.co/packages/5.x/yum
gpgcheck=1
gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch
enabled=1
autorefresh=1
type=rpm-md
- Install Elasticsearch
yum install elasticsearch
- Configure the Elasticsearch cluster
# /etc/elasticsearch/elasticsearch.yml
# IPs of all instances in the cluster
discovery.zen.ping.unicast.hosts: ["192.168.168.240", "192.168.168.241", "192.168.168.242"]
# IP on which this node is reachable
network.host: 192.168.168.242
# Cluster name
cluster.name: ES_PROD
# Node name
node.name: ${HOSTNAME}
# Data directory; create it first:
#   mkdir -p /data/elasticsearch/data
#   chown -R elasticsearch:elasticsearch /data/elasticsearch
path.data: /data/elasticsearch/data
# Log directory; create it first:
#   mkdir -p /data/elasticsearch/logs
#   chown -R elasticsearch:elasticsearch /data/elasticsearch
path.logs: /data/elasticsearch/logs
- Edit the Elasticsearch systemd service
# In the Elasticsearch systemd unit file, delete the following lines
-Edefault.path.logs=${LOG_DIR} \
-Edefault.path.data=${DATA_DIR} \
- Start the service
systemctl daemon-reload   # pick up the edited unit file
systemctl start elasticsearch
- Health check
curl -XGET 'http://192.168.168.240:9200/_cluster/state?pretty'
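The cluster state dump is verbose. For a quick yes/no answer, the `_cluster/health` endpoint is easier to evaluate, because it returns a short JSON document with fields such as `status` and `number_of_nodes`. The following Python sketch swaps in that endpoint instead of the `_cluster/state` call used above. Treating anything other than "green" as a problem is a choice made for this example, and the function names are hypothetical.

```python
import json
from typing import List
from urllib.request import urlopen


def fetch_cluster_health(base_url: str, timeout: float = 5.0) -> dict:
    """GET <base_url>/_cluster/health and return the decoded JSON document."""
    with urlopen(f"{base_url}/_cluster/health", timeout=timeout) as resp:
        return json.load(resp)


def cluster_problems(health: dict, expected_nodes: int) -> List[str]:
    """Return human-readable problems; an empty list means the check passed."""
    problems = []
    status = health.get("status")
    if status != "green":
        problems.append(f"cluster status is {status!r}, expected 'green'")
    nodes = health.get("number_of_nodes")
    if nodes != expected_nodes:
        problems.append(f"{nodes} node(s) joined, expected {expected_nodes}")
    return problems


if __name__ == "__main__":
    # Address taken from the text; adjust it to your own node.
    health = fetch_cluster_health("http://192.168.168.240:9200")
    for line in cluster_problems(health, expected_nodes=3) or ["cluster looks healthy"]:
        print(line)
```

If fewer than three nodes show up, the usual suspects are the unicast host list and network.host in elasticsearch.yml.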
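Setting up Graylog 2.4 on multiple nodes
In a multi-node Graylog setup, only one instance needs to be kept for web access. Logs can be configured to go to different nodes; when one node comes under too much load, you have to move part of the log traffic to an idle node by hand. (Load balancing at this layer is still to be investigated.)
- Install the Graylog 2.4 package repository
rpm -Uvh https://packages.graylog2.org/repo/packages/graylog-2.4-repository_latest.rpm
- Install graylog-server
yum install graylog-server
- Multi-node Graylog configuration
# /etc/graylog/server/server.conf
# In a multi-node installation, keep exactly one node as true and set all others to false
is_master = true
# Set the time zone to Shanghai
root_timezone = PRC
# Password: admin. To generate a hash: echo -n yourpassword | shasum -a 256
root_password_sha2 = 8c6976e5b5410415bde908bd4dee15dfb167a9c873fc4bb8a81f6f2ab448a918
# password_secret (not shown here) must also be set, to the same value on every node
# In multi-node mode, make sure this IP is reachable from the other nodes
rest_listen_uri = http://192.168.168.242:9000/api/
# Defaults to the same value as above
rest_transport_uri = http://192.168.168.242:9000/api/
# Base URL for external access; behind an nginx reverse proxy this can be set to http://192.168.168.242:9000/
web_listen_uri = http://121.196.213.107:50089/
# Elasticsearch cluster hosts (comma-separated)
elasticsearch_hosts = http://192.168.168.240:9200,http://192.168.168.241:9200,http://192.168.168.242:9200
# Keep only 72 hours of logs
elasticsearch_max_time_per_index = 1h
elasticsearch_max_number_of_indices = 72
# MongoDB connection string
mongodb_uri = mongodb://192.168.168.242:27017,192.168.168.241:27017,192.168.168.240:27017/graylog?replicaSet=repl
- Adjust the Graylog JVM heap size
# /etc/sysconfig/graylog-server
GRAYLOG_SERVER_JAVA_OPTS="-Xms4g -Xmx4g -XX:NewRatio=1 -server -XX:+ResizeTLAB -XX:+UseConcMarkSweepGC -XX:+CMSConcurrentMTEnabled -XX:+CMSClassUnloadingEnabled -XX:+UseParNewGC -XX:-OmitStackTraceInFastThrow"
- Start the service
systemctl start graylog-server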
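The 72-hour retention in the configuration above comes from multiplying two settings: one index per elasticsearch_max_time_per_index, and at most elasticsearch_max_number_of_indices of them kept. The sketch below spells out that arithmetic in Python, assuming time-based index rotation is in use. The tiny duration parser only handles a single number with one of the suffixes s, m, h, d or w; it is a simplification written for this example, not Graylog's own parser.

```python
import math
import re

# Hours represented by each supported suffix.
_UNIT_HOURS = {"s": 1 / 3600, "m": 1 / 60, "h": 1, "d": 24, "w": 168}


def period_hours(period: str) -> float:
    """Convert a value such as '1h' or '1d' into hours."""
    match = re.fullmatch(r"(\d+)([smhdw])", period.strip())
    if match is None:
        raise ValueError(f"unsupported period: {period!r}")
    return int(match.group(1)) * _UNIT_HOURS[match.group(2)]


def retention_hours(max_time_per_index: str, max_number_of_indices: int) -> float:
    """Approximate retention: rotation period times the number of indices kept."""
    return period_hours(max_time_per_index) * max_number_of_indices


def indices_needed(max_time_per_index: str, target_hours: float) -> int:
    """How many indices must be kept to cover `target_hours` of logs."""
    return math.ceil(target_hours / period_hours(max_time_per_index))


if __name__ == "__main__":
    print("retention (hours):", retention_hours("1h", 72))
    print("indices for 7 days at 1h each:", indices_needed("1h", 7 * 24))
```

Hourly indices keep each index small, at the cost of more indices and shards for Elasticsearch to manage. Rotating daily and keeping three indices would give about the same window.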
Nginx reverse proxy configuration
- Install Nginx
yum install nginx
- Configure Nginx
server
{
    listen 50098 default_server;
    listen [::]:50098 default_server ipv6only=on;
    server_name 47.97.188.62:50098;

    location / {
        proxy_set_header Host $http_host;
        proxy_set_header X-Forwarded-Host $host;
        proxy_set_header X-Forwarded-Server $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Graylog-Server-URL http://$server_name/api;
        proxy_pass http://192.168.168.242:9000;
    }
}
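Points to consider for later optimization
- Add more CPU, which increases message-processing throughput
- Open several more UDP ports, which increases message-processing throughput
- Give Elasticsearch more memory to increase buffering and reduce I/O
- Switch to faster disks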
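As an illustration of the "more UDP ports" idea, a sender can spread its messages over several GELF UDP inputs in round-robin fashion. The Python sketch below assumes that GELF UDP inputs have already been created in Graylog on the listed hosts and ports; the target list is made up for this example. Payloads are sent as uncompressed JSON in a single datagram, so the sketch does not handle GELF chunking for large messages.

```python
import itertools
import json
import socket
import time
from typing import Iterable, Iterator, Sequence, Tuple

Target = Tuple[str, int]


def build_gelf(host: str, short_message: str, level: int = 6, **extra) -> bytes:
    """Build a GELF 1.1 payload; extra fields get the required '_' prefix."""
    payload = {
        "version": "1.1",
        "host": host,
        "short_message": short_message,
        "timestamp": time.time(),
        "level": level,  # syslog severity, 6 = informational
    }
    for key, value in extra.items():
        payload[f"_{key}"] = value  # note: GELF does not allow a field named "_id"
    return json.dumps(payload).encode("utf-8")


def assign_targets(messages: Iterable[bytes], targets: Sequence[Target]) -> Iterator[Tuple[bytes, Target]]:
    """Pair each message with the next target, cycling through the list."""
    if not targets:
        raise ValueError("at least one target is required")
    return zip(messages, itertools.cycle(targets))


def send_round_robin(messages: Iterable[bytes], targets: Sequence[Target]) -> int:
    """Send every message over UDP to its assigned target; return the count sent."""
    sent = 0
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        for message, target in assign_targets(messages, targets):
            sock.sendto(message, target)
            sent += 1
    finally:
        sock.close()
    return sent


if __name__ == "__main__":
    # Hypothetical inputs: two ports on one Graylog node, one on another.
    targets = [("192.168.168.242", 12201), ("192.168.168.242", 12202), ("192.168.168.241", 12201)]
    batch = [build_gelf("device-gateway", f"gps report {i}", device_id=f"dev-{i}") for i in range(6)]
    print("sent", send_round_robin(batch, targets), "messages")
```

This is client-side distribution only. It does not react to a node becoming overloaded, so it is not a substitute for the load-balancing layer that the Graylog section leaves open.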