ElasticSearch Cluster & Data Backup & Optimization

阿新 • Published: 2020-08-12

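ElasticSearch Cluster Concepts

ES cluster health colors

①. Red: at least one primary shard is unassigned, so the data is incomplete

②. Yellow: all primary shards are assigned (the data is complete), but one or more replica shards are not

③. Green: all primary and replica shards are assigned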

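ES cluster node types

①. Master node: manages the cluster and schedules shard allocation

②. Data node: stores and processes the shards allocated to it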

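ES cluster shard types

①. Primary shard: stores the data and handles reads and writes

②. Replica shard: a copy of a primary shard; it provides redundancy and can also serve reads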

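ES cluster safeguards

①. Data is automatically distributed across multiple nodes

②. If the node holding a primary shard goes down, a replica of that shard on another node is automatically promoted to primary

③. If the master node goes down, another master-eligible node is automatically elected as the new master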

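ES cluster configuration notes

①. When configuring cluster nodes, you do not need to list every node's IP in the configuration file. The local IP plus the IP of any one machine already in the cluster is enough:

# Edit /etc/elasticsearch/elasticsearch.yml on each node
# On node 122:
discovery.zen.ping.unicast.hosts: ["10.0.0.121", "10.0.0.122"]
# On node 123:
discovery.zen.ping.unicast.hosts: ["10.0.0.121", "10.0.0.123"]
# On node xxx:
discovery.zen.ping.unicast.hosts: ["10.0.0.121", "10.0.0.xxx"]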

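②. The number of nodes required to elect a master must be N/2+1, where N is the total number of (master-eligible) cluster nodes and the division is rounded down:

# Edit /etc/elasticsearch/elasticsearch.yml; the cluster currently has N = 3 nodes
discovery.zen.minimum_master_nodes: 2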
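
The N/2+1 rule can be checked with a few lines of shell. This is only a sketch: the function name min_master_nodes is made up for illustration and is not part of ES.

```shell
#!/bin/bash
# Hypothetical helper: print the N/2+1 election quorum for a given node count.
min_master_nodes() {
    local nodes=$1
    # Shell integer division rounds down, which matches the N/2+1 rule.
    echo $(( nodes / 2 + 1 ))
}

min_master_nodes 3   # prints 2
min_master_nodes 5   # prints 3
```

The printed value is what would go into discovery.zen.minimum_master_nodes for a cluster of that size.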

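③. ES defaults to 5 shards and 1 replica per index (in versions before 7.0; from 7.0 on the default is 1 shard). Once an index has been created, its number of shards cannot be changed, but its number of replicas can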

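④. Shard colors shown while data is being allocated:

1) Purple: the shard is being relocated (seen when nodes are added to the cluster)

2) Yellow: the shard is being copied (after a node goes down, the remaining nodes have to rebuild the missing replicas)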

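⑤. In a three-node cluster, the failures the data can survive depend on the configured number of replicas:

1) Three nodes, no replicas: not a single machine can fail

2) Three nodes, one replica: two machines can fail, but only one at a time (the cluster needs time to rebuild the lost replicas before the second failure)

3) Three nodes, two replicas: two machines can fail at the same time

This concerns the data only. With discovery.zen.minimum_master_nodes: 2, a single surviving node cannot elect a master until a second node comes back.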
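
The three cases follow from counting shard copies: each shard exists replicas + 1 times, and copies of the same shard are placed on different nodes. A rough sketch of that arithmetic, with a made-up function name, assuming shards are fully allocated before the failure:

```shell
#!/bin/bash
# Hypothetical helper: how many nodes can fail at the same moment
# without losing every copy of some shard.
max_simultaneous_failures() {
    local nodes=$1 replicas=$2
    # Each shard has replicas + 1 copies, but never more copies than nodes.
    local copies=$(( replicas + 1 ))
    if [ "$copies" -gt "$nodes" ]; then
        copies=$nodes
    fi
    # One copy must survive.
    echo $(( copies - 1 ))
}

max_simultaneous_failures 3 0   # prints 0
max_simultaneous_failures 3 1   # prints 1
max_simultaneous_failures 3 2   # prints 2
```

With one replica the result is 1, which is why a second machine may only fail after the cluster has had time to rebuild the replicas lost in the first failure.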

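ES cluster commands

# ======= ES cluster status ======= #
# 1. Show the master node
GET _cat/master

# 2. Show cluster health
GET _cat/health

# 3. List indices
GET _cat/indices

# 4. List all nodes
GET _cat/nodes

# 5. List shards
GET _cat/shards

# In general, the following two commands are enough to monitor cluster health; if the output of either one changes, something has gone wrong in the cluster
GET _cat/health
GET _cat/nodes
# In practice, Kibana bundles X-Pack, which can monitor cluster health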
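
For monitoring from a script, the status can be read from the fourth column of the _cat/health output (epoch, timestamp, cluster, status, ...). The sketch below assumes that default column layout; the function names and the sample line are made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: extract the status column from one line of
# _cat/health output (columns: epoch, timestamp, cluster, status, ...).
health_status() {
    printf '%s\n' "$1" | awk '{ print $4 }'
}

# Hypothetical helper: succeed only when the status is green.
is_green() {
    [ "$(health_status "$1")" = "green" ]
}

# Sample line in the default column layout (values are made up).
sample='1596240000 00:00:00 my-cluster yellow 3 3 10 5 0 0 5 0 - 50.0%'

if is_green "$sample"; then
    echo "cluster OK"
else
    echo "cluster needs attention: $(health_status "$sample")"
fi
```

Against a live cluster, the line could be fetched with something like curl -s http://10.0.0.121:9200/_cat/health and passed to is_green from a cron job.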

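ElasticSearch Cluster Configuration Changes

Configuring the number of shards & replicas

ES defaults to 5 shards and 1 replica per index (before 7.0). Once an index has been created, its number of shards cannot be changed, but its number of replicas can

# ======= Configuration file ======= #
# Edit the parameters in /etc/elasticsearch/elasticsearch.yml
# Note: ES 5.0 and later no longer accept index-level settings in elasticsearch.yml;
# there they must be set per index or through an index template
# Number of shards per index, default 5
index.number_of_shards: 5

# Number of replicas per index, default 1
index.number_of_replicas: 1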

Changing the number of replicas of a specific index

PUT /index/_settings
{
  "number_of_replicas": 2
}

Changing the number of replicas of all indices

PUT _all/_settings
{
  "number_of_replicas": 2
}

Specifying the number of shards & replicas when creating an index

PUT /testone
{
  "settings": {
    "number_of_shards": 3,
    "number_of_replicas": 2
  }
}

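Note that more shards is not always better:

①. Every shard consumes resources

②. Every shard uses file handles

③. A query is routed by an algorithm to the nodes that hold the relevant shards, so the fewer shards there are, the cheaper the query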

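Recommendations for shard & replica counts

①. Talk to the developers

②. Look at how many nodes there will be in total

- With 2 nodes, the defaults are fine (1 replica, 5 shards)

- With 3 nodes, use 2 replicas and 5 shards for important data, and 1 replica and 5 shards for less important data

③. At the start, a good rule of thumb is to create 1.5 to 3 times as many shards as you have nodes

- For example, with 3 nodes, create at most 9 (3 x 3) shards

④. Indices that store a lot of data can be given more shards; indices that store little data can be given fewer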
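
The 1.5x to 3x rule of thumb is simple enough to compute in shell. In this sketch the function name shard_range is made up, and rounding the lower bound up to a whole shard is an assumption; the text above does not say how to round.

```shell
#!/bin/bash
# Hypothetical helper: print the lower and upper shard counts for a node
# count, following the 1.5x to 3x rule of thumb.
shard_range() {
    local nodes=$1
    # 1.5 x nodes rounded up (assumption): (3n + 1) / 2 in integer arithmetic.
    local low=$(( (nodes * 3 + 1) / 2 ))
    # 3 x nodes is the suggested maximum.
    local high=$(( nodes * 3 ))
    echo "$low $high"
}

shard_range 3   # prints "5 9"
```

For 3 nodes the upper bound is 9, the same figure as in the example above.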

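ElasticSearch Configuration Tuning

Limiting memory

1. Keep the JVM heap at no more than 32 GB
2. Give ES half of the server's memory
3. Start with a smaller setting and raise it gradually
4. When memory runs short:
	1) Ask the developers to delete data
	2) Add nodes
	3) Upgrade the hardware
5. Disable swap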
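
Points 1 and 2 combine into "half of the RAM, capped at 32 GB". A minimal sketch of that calculation, with a made-up function name and whole gigabytes only:

```shell
#!/bin/bash
# Hypothetical helper: suggest a heap size in GB for a server with the
# given amount of RAM in GB.
suggest_heap_gb() {
    local total_gb=$1
    local max_gb=32                  # ceiling taken from point 1 of the list
    local half=$(( total_gb / 2 ))   # point 2: half of the server's memory
    if [ "$half" -gt "$max_gb" ]; then
        echo "$max_gb"
    else
        echo "$half"
    fi
}

suggest_heap_gb 16    # prints 8
suggest_heap_gb 128   # prints 32
```

The result would typically be written as matching -Xms and -Xmx values in the JVM options file of the ES installation. Following point 3, it is reasonable to start below the suggested figure.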

File handle limits

# Configure file descriptor limits
[root@db02 ~]# vim /etc/security/limits.conf
* soft memlock unlimited
* hard memlock unlimited
* soft nofile 131072
* hard nofile 131072



# Process limits for regular users (CentOS 7)
[root@db02 ~]# vim /etc/security/limits.d/20-nproc.conf 
*          soft    nproc     65535
root       soft    nproc     unlimited

# Process limits for regular users (CentOS 6)
[root@db02 ~]# vim /etc/security/limits.d/90-nproc.conf 
*          soft    nproc     65535
root       soft    nproc     unlimited

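Query optimization

1. For conditional queries, prefer term queries and cut down on range queries
2. When building an index, try to use terms with a high hit rate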

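ElasticSearch Data Backup and Restore

Installing the npm environment

# Install Node.js v12 on Linux
# Step 1: configure the yum repository
curl --silent --location https://rpm.nodesource.com/setup_12.x | sudo bash -

# Step 2: install the package
yum install -y nodejs

# Step 3: verify the installation
node -v

# Step 4: use the Taobao mirror (the mirror has since moved to https://registry.npmmirror.com)
npm config set registry http://registry.npm.taobao.org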

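Installing the backup tool

[root@db01 ~]# npm install elasticdump -g

Backup commands

# Documentation
https://github.com/elasticsearch-dump/elasticsearch-dump

Backup parameters

--input: the data source
--output: the destination that receives the data
--type: the type of data to export (settings, analyzer, data, mapping, alias, template)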

Backing up to a cluster

# Back up data to another ES cluster
elasticdump \
  --input=http://10.0.0.121:9200/my_index \
  --output=http://10.0.0.51:9200/my_index \
  --type=analyzer
 
elasticdump \
  --input=http://10.0.0.121:9200/my_index \
  --output=http://10.0.0.51:9200/my_index \
  --type=mapping
  
elasticdump \
  --input=http://10.0.0.121:9200/my_index \
  --output=http://10.0.0.51:9200/my_index \
  --type=data

elasticdump \
  --input=http://10.0.0.121:9200/my_index \
  --output=http://10.0.0.51:9200/my_index \
  --type=template

Backing up to local files

elasticdump \
  --input=http://10.0.0.121:9200/student \
  --output=/tmp/student_mapping.json \
  --type=mapping
  
elasticdump \
  --input=http://10.0.0.121:9200/student \
  --output=/tmp/student_data.json \
  --type=data

Compressing the exported file

elasticdump \
  --input=http://production.es.com:9200/my_index \
  --output=$ \
  | gzip > /data/my_index.json.gz

Backing up data that matches a condition

elasticdump \
  --input=http://production.es.com:9200/my_index \
  --output=query.json \
  --searchBody="{\"query\":{\"term\":{\"username\": \"admin\"}}}"

Importing data

elasticdump \
  --input=./student_template.json \
  --output=http://10.0.0.121:9200 \
  --type=template
  
elasticdump \
  --input=./student_mapping.json \
  --output=http://10.0.0.121:9200 \
  --type=mapping
  
elasticdump \
  --input=./student_data.json \
  --output=http://10.0.0.121:9200 \
  --type=data
  
elasticdump \
  --input=./student_analyzer.json \
  --output=http://10.0.0.121:9200 \
  --type=analyzer

# When restoring, data that already exists is overwritten

Backup script

[root@dbtest03 test]# cat bak.sh
#!/bin/bash
# IP of the cluster node to back up from
host_ip=10.0.0.121
# Indices to back up, one per line
index_name='
student
teacher
abc
'
for index in $index_name
do
    echo "start dump index ${index}"
    # Dump each data type of the index to its own file under /data
    for type in analyzer mapping data alias template
    do
        elasticdump --input=http://${host_ip}:9200/${index} --output=/data/${index}_${type}.json --type=${type} &> /dev/null
    done
done

Import script

[root@dbtest03 test]# cat imp.sh
#!/bin/bash
# IP of the cluster node to import into
host_ip=10.0.0.121
# Indices to import, one per line
index_name='
abc
student
'
for index in $index_name
do
    echo "start input index ${index}"
    # Load analyzer settings and mapping first, then documents, aliases, and templates
    for type in analyzer mapping data alias template
    do
        file=/data/${index}_${type}.json
        # Skip any type that has no dump file
        [ -f "${file}" ] || continue
        elasticdump --input=${file} --output=http://${host_ip}:9200/${index} --type=${type} &> /dev/null
    done
done
