Deploying the ELFK logging stack on k8s

### The ELFK logging stack

#### Preface

After last week's technical evaluation, we held a meeting on Monday and, given the company's current traffic and tech stack, settled on the following logging solution: elasticsearch (es) + logstash (lo) + filebeat (fi) + kibana (ki). For es we use the managed Elasticsearch provided by aliyun, lo and fi we deploy ourselves, and ki comes bundled with the aliyun ES offering. Because requisitioning ECS instances takes some time, we deploy to the test & production environment for now (a gripe: our test and production share a single k8s cluster, hosted on aliyun...). It took one day (I had already done a rough trial deployment earlier) to get elfk running on kubernetes. Get it deployed first; optimization can come later as needed.

#### Component overview

es is a real-time, distributed, scalable search engine that supports full-text and structured search. It is typically used to index and search large volumes of log data, but it can also search many other kinds of documents.

lo's main advantage is its flexibility: it has a large plugin ecosystem, detailed documentation, and a straightforward configuration format, which lets it fit many scenarios. Resources are easy to find online for almost any problem you hit.

As a member of the Beats family, fi is a lightweight log shipper that makes up for lo's weakness: being lightweight, fi can cheaply push logs from every node to the central lo.

ki is an analytics and visualization platform for browsing and visualizing the log data stored in an es cluster and building dashboards. ki is simple to operate, integrates most of the es APIs, and is the standard application for presenting logs.

#### Data collection flow

![img](https://leanote.com/api/file/getImage?fileId=5d3f9e37ab6441734a00607f)

Log flow: logs_data ---> fi ---> lo ---> es ---> ki.

fi collects logs_data and ships it to lo; lo filters and mutates the events and sends them on to es; ki reads from es for analysis.

#### Deployment

Based on our cluster's actual state, this document reproduces the logging-system deployment exactly as it runs.

![](https://img2020.cnblogs.com/blog/1464583/202006/1464583-20200610134820769-1740421235.png)

##### Install kubectl on a local machine (Mac) and connect to the aliyun-managed k8s

On the client (any local VM will do), install the same kubectl version as the managed k8s:

```shell
curl -LO https://storage.googleapis.com/kubernetes-release/release/v1.14.8/bin/linux/amd64/kubectl
chmod +x ./kubectl
mv ./kubectl /usr/local/bin/kubectl
# Copy the kubeconfig of the aliyun-managed k8s to $HOME/.kube/config,
# and watch out for file-permission issues.
```

##### Deploy ELFK

Request a namespace (generally one namespace per project). The same namespace is referenced by every manifest below, so keep it consistent:

```yaml
# cat kube-logging.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: logging
```

Deploy es. Find a close-enough manifest online, adapt it to your needs, apply it, and iterate on whatever errors the logs report:

```yaml
# cat elasticsearch.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-class
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
# Supported policies: Delete, Retain
reclaimPolicy: Delete
---
kind: PersistentVolume
apiVersion: v1
metadata:
  name: datadir1
  labels:
    type: local
spec:
  storageClassName: local-class
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/data/data1"
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: elasticsearch
  namespace: logging
spec:
  serviceName: elasticsearch
  selector:
    matchLabels:
      app: elasticsearch
  template:
    metadata:
      labels:
        app: elasticsearch
    spec:
      containers:
      - name: elasticsearch
        image: elasticsearch:7.3.1
        resources:
          limits:
            cpu: 1000m
          requests:
            cpu: 100m
        ports:
        - containerPort: 9200
          name: rest
          protocol: TCP
        - containerPort: 9300
          name: inter-node
          protocol: TCP
        volumeMounts:
        - name: data
          mountPath: /usr/share/elasticsearch/data
        env:
        - name: "discovery.type"
          value: "single-node"
        - name: cluster.name
          value: k8s-logs
        - name: node.name
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: ES_JAVA_OPTS
          value: "-Xms512m -Xmx512m"
      initContainers:
      - name: fix-permissions
        image: busybox
        command: ["sh", "-c", "chown -R 1000:1000 /usr/share/elasticsearch/data"]
        securityContext:
          privileged: true
        volumeMounts:
        - name: data
          mountPath: /usr/share/elasticsearch/data
      - name: increase-vm-max-map
        image: busybox
        command: ["sysctl", "-w", "vm.max_map_count=262144"]
        securityContext:
          privileged: true
      - name: increase-fd-ulimit
        image: busybox
        command: ["sh", "-c", "ulimit -n 65536"]
        securityContext:
          privileged: true
  volumeClaimTemplates:
  - metadata:
      name: data
      labels:
        app: elasticsearch
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: "local-class"
      resources:
        requests:
          storage: 5Gi
---
kind: Service
apiVersion: v1
metadata:
  name: elasticsearch
  namespace: logging
  labels:
    app: elasticsearch
spec:
  selector:
    app: elasticsearch
  clusterIP: None
  ports:
    - port: 9200
      name: rest
    - port: 9300
      name: inter-node
```
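Before wiring up the rest of the stack, it is worth confirming that es actually came up. A minimal sanity check, assuming the pod name `elasticsearch-0` (replica 0 of the StatefulSet above) and the `logging` namespace:

```shell
# The pod should be 1/1 Running before you continue.
kubectl -n logging get pods -l app=elasticsearch

# Forward the REST port locally and query cluster health; a single-node
# cluster normally reports "yellow" or "green" status.
kubectl -n logging port-forward pod/elasticsearch-0 9200:9200 &
sleep 2
curl -s http://localhost:9200/_cluster/health?pretty
```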
Deploy ki. Per the data-collection flow, ki pairs directly with es, so its configuration is relatively simple:

```yaml
# cat kibana.yaml
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: kibana
  namespace: logging
  labels:
    k8s-app: kibana
spec:
  replicas: 1
  selector:
    matchLabels:
      k8s-app: kibana
  template:
    metadata:
      labels:
        k8s-app: kibana
    spec:
      containers:
      - name: kibana
        image: kibana:7.3.1
        resources:
          limits:
            cpu: 1
            memory: 500Mi
          requests:
            cpu: 0.5
            memory: 200Mi
        env:
        - name: ELASTICSEARCH_HOSTS
          # Note: the value is the es Service. Because es is stateful and uses a
          # headless service, this name resolves to the pods behind it rather
          # than a single pod name.
          value: http://elasticsearch:9200
        ports:
        - containerPort: 5601
          name: ui
          protocol: TCP
---
apiVersion: v1
kind: Service
metadata:
  name: kibana
  namespace: logging
spec:
  ports:
  - port: 5601
    protocol: TCP
    targetPort: ui
  selector:
    k8s-app: kibana
```

Configure the ingress-controller. Our company uses the nginx-ingress that ships with aliyun-managed k8s, with forced HTTPS redirection, so the kibana ingress must also be configured for HTTPS:

```shell
# openssl genrsa -out tls.key 2048
# openssl req -new -x509 -key tls.key -out tls.crt -subj /C=CN/ST=Beijing/L=Beijing/O=DevOps/CN=kibana.test.realibox.com
# kubectl create secret tls kibana-ingress-secret --cert=tls.crt --key=tls.key
```

The kibana-ingress configuration follows, in two variants: one HTTPS, one HTTP.

```yaml
# https:
# cat kibana-ingress.yaml
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: kibana
  namespace: logging
spec:
  tls:
  - hosts:
    - kibana.test.realibox.com
    secretName: kibana-ingress-secret
  rules:
  - host: kibana.test.realibox.com
    http:
      paths:
      - path: /
        backend:
          serviceName: kibana
          servicePort: 5601

# http:
# cat kibana-ingress.yaml
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: kibana
  namespace: logging
spec:
  rules:
  - host: kibana.test.realibox.com
    http:
      paths:
      - path: /
        backend:
          serviceName: kibana
          servicePort: 5601
```

Deploy lo. lo's job is to filter the logs that fi collects, and different logs need different handling, so the pipeline config will likely change often and should be decoupled from the image. We therefore mount it as a configmap:

```yaml
# cat logstash.yaml
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: logstash
  namespace: logging
spec:
  replicas: 1
  selector:
    matchLabels:
      app: logstash
  template:
    metadata:
      labels:
        app: logstash
    spec:
      containers:
      - name: logstash
        image: elastic/logstash:7.3.1
        volumeMounts:
        - name: config
          mountPath: /opt/logstash/config/containers.conf
          subPath: containers.conf
        command:
        - "/bin/sh"
        - "-c"
        - "/opt/logstash/bin/logstash -f /opt/logstash/config/containers.conf"
      volumes:
      - name: config
        configMap:
          name: logstash-k8s-config
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: logstash
  name: logstash
  namespace: logging
spec:
  ports:
  - port: 8080
    targetPort: 8080
  selector:
    app: logstash
  type: ClusterIP

# cat logstash-config.yaml
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: logstash-k8s-config
  namespace: logging
data:
  containers.conf: |
    input {
      beats {
        port => 8080   # the port filebeat connects to
      }
    }
    output {
      elasticsearch {
        hosts => ["elasticsearch:9200"]   # the es Service
        index => "logstash-%{+YYYY.MM.dd}"
      }
    }
```

Note: modifying a configmap is effectively modifying the image; you must re-apply the resource manifest for the change to take effect. Per the data-collection flow, lo's data comes in from fi and flows out to es.
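One extra wrinkle with the manifest above: because `containers.conf` is mounted via `subPath`, an updated configmap is not propagated into a running pod, so re-applying alone is not enough. A sketch of the update flow, assuming the labels and namespace used above (our kubectl is 1.14; `kubectl rollout restart` only exists from kubectl 1.15 on):

```shell
# Push the new pipeline config.
kubectl apply -f logstash-config.yaml

# Delete the pod; the Deployment immediately recreates it, and the new pod
# mounts the updated configmap.
kubectl -n logging delete pod -l app=logstash

# On kubectl >= 1.15 this would work instead:
# kubectl -n logging rollout restart deployment/logstash
```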
Deploy fi. fi's job is to collect the logs and hand them to lo:

```yaml
# cat filebeat.yaml
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: filebeat-config
  namespace: logging
  labels:
    app: filebeat
data:
  filebeat.yml: |-
    filebeat.config:
      inputs:
        # Mounted `filebeat-inputs` configmap:
        path: ${path.config}/inputs.d/*.yml
        # Reload inputs configs as they change:
        reload.enabled: false
      modules:
        path: ${path.config}/modules.d/*.yml
        # Reload module configs as they change:
        reload.enabled: false
    # To enable hints based autodiscover, remove `filebeat.config.inputs` configuration and uncomment this:
    #filebeat.autodiscover:
    #  providers:
    #    - type: kubernetes
    #      hints.enabled: true
    output.logstash:
      hosts: ['${LOGSTASH_HOST:logstash}:${LOGSTASH_PORT:8080}']   # ship to lo
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: filebeat-inputs
  namespace: logging
  labels:
    app: filebeat
data:
  kubernetes.yml: |-
    - type: docker
      containers.ids:
      - "*"
      processors:
        - add_kubernetes_metadata:
            in_cluster: true
---
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: filebeat
  namespace: logging
  labels:
    app: filebeat
spec:
  selector:
    matchLabels:
      app: filebeat
  template:
    metadata:
      labels:
        app: filebeat
    spec:
      serviceAccountName: filebeat
      terminationGracePeriodSeconds: 30
      containers:
      - name: filebeat
        image: elastic/filebeat:7.3.1
        args: [
          "-c", "/etc/filebeat.yml",
          "-e",
        ]
        env:   # injected variables
        - name: LOGSTASH_HOST
          value: logstash
        - name: LOGSTASH_PORT
          value: "8080"
        securityContext:
          runAsUser: 0
          # If using Red Hat OpenShift uncomment this:
          #privileged: true
        resources:
          limits:
            memory: 200Mi
          requests:
            cpu: 100m
            memory: 100Mi
        volumeMounts:
        - name: config
          mountPath: /etc/filebeat.yml
          readOnly: true
          subPath: filebeat.yml
        - name: inputs
          mountPath: /usr/share/filebeat/inputs.d
          readOnly: true
        - name: data
          mountPath: /usr/share/filebeat/data
        - name: varlibdockercontainers
          mountPath: /var/lib/docker/containers
          readOnly: true
      volumes:
      - name: config
        configMap:
          defaultMode: 0600
          name: filebeat-config
      - name: varlibdockercontainers
        hostPath:
          path: /var/lib/docker/containers
      - name: inputs
        configMap:
          defaultMode: 0600
          name: filebeat-inputs
      # data folder stores a registry of read status for all files, so we don't send everything again on a Filebeat pod restart
      - name: data
        hostPath:
          path: /var/lib/filebeat-data
          type: DirectoryOrCreate
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: filebeat
subjects:
- kind: ServiceAccount
  name: filebeat
  namespace: logging
roleRef:
  kind: ClusterRole
  name: filebeat
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
  name: filebeat
  labels:
    app: filebeat
rules:
- apiGroups: [""]   # "" indicates the core API group
  resources:
  - namespaces
  - pods
  verbs:
  - get
  - watch
  - list
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: filebeat
  namespace: logging
  labels:
    app: filebeat
```

That completes the es + lo + fi + ki deployment on k8s; now a quick verification.

#### Verification

Check the svc, pod, and ingress information:

```shell
# kubectl get svc,pods,ingress -n logging
NAME                    TYPE        CLUSTER-IP        EXTERNAL-IP   PORT(S)             AGE
service/elasticsearch   ClusterIP   None              <none>        9200/TCP,9300/TCP   151m
service/kibana          ClusterIP   xxx.168.239.2xx   <none>        5601/TCP            20h
service/logstash        ClusterIP   xxx.168.38.1xx    <none>        8080/TCP            122m

NAME                            READY   STATUS    RESTARTS   AGE
pod/elasticsearch-0             1/1     Running   0          151m
pod/filebeat-24zl7              1/1     Running   0          118m
pod/filebeat-4w7b6              1/1     Running   0          118m
pod/filebeat-m5kv4              1/1     Running   0          118m
pod/filebeat-t6x4t              1/1     Running   0          118m
pod/kibana-689f4bd647-7jrqd     1/1     Running   0          20h
pod/logstash-76bc9b5f95-qtngp   1/1     Running   0          122m

NAME                        HOSTS                      ADDRESS         PORTS     AGE
ingress.extensions/kibana   kibana.test.realibox.com   xxx.xx.xx.xxx   80, 443   19h
```
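Beyond checking that everything is Running, you can confirm data is flowing end to end by asking es which indices it holds; with the lo output configured above, a daily `logstash-YYYY.MM.dd` index should show up. A sketch, assuming curl is available inside the es image (it is in the official elasticsearch images):

```shell
# _cat/indices lists each index with its doc count and size; the presence of
# a logstash-* index confirms the fi -> lo -> es pipeline is delivering data.
kubectl -n logging exec elasticsearch-0 -- curl -s 'http://localhost:9200/_cat/indices?v'
```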
##### Web configuration

Configure the index pattern:

![](https://img2020.cnblogs.com/blog/1464583/202006/1464583-20200610134908513-1809544000.png)

![](https://img2020.cnblogs.com/blog/1464583/202006/1464583-20200610134921517-1780331850.png)

Discover:

![](https://img2020.cnblogs.com/blog/1464583/202006/1464583-20200610134939077-124823527.png)

With that, the basic setup is done. Plenty of optimization remains, but that is for later.

#### Lessons learned

This was more or less the first time I deployed an application to test & production myself, and a logging system I barely knew at that. I ran into plenty of problems worth writing down:

1. How to research a technology stack;
2. How to settle on a solution;
3. I found almost no comparable solution online (no idea how other companies do this; at any rate nothing usable turned up), so I had to piece things together from different docs and experiment myself;
4. Keep a component's labels as consistent as possible;
5. How to find out whether the company enforces port restrictions and forced HTTPS redirection;
6. For anything IT-related, always read the logs. This matters: logs solve the vast majority of problems;
7. However hard one person tries, they will miss something; try it yourself first, then ask friends, and improve together;
8. Ship the project first and worry about the rest later; that is the stance for now. If you are 20% sure about something, go do it; waiting until you are 80% sure makes it pointless;
9. Self-study mainly teaches theory; the practical side you only learn on the job.