
# Using HDFS as a Disaster Recovery Solution for Elasticsearch 7.2

[toc]

# Preface

Elasticsearch replicas provide high availability: they let the cluster tolerate the loss of the occasional node without interrupting service. Replicas do not, however, protect against catastrophic failure. For that you need a true backup of the cluster, a complete copy to fall back on when something really does go wrong.

This case study models an Elasticsearch 7.2 cluster and backs it up with the `snapshot` API.

`HDFS`, the Hadoop distributed file system, serves as the example snapshot repository.

# Snapshot Version Compatibility

![](http://bed.thunisoft.com:9000/ibed/2020/10/11/AtK7rrrai.png)

A snapshot can only be restored into a cluster running the same major version as the one that took it, or one major version newer.

# Backing Up the Cluster

## HDFS File System

### Software Download

[Download](https://downloads.apache.org/hadoop/common/hadoop-3.3.0/hadoop-3.3.0.tar.gz)

```
hadoop-3.3.0.tar.gz
```

### JDK Environment

> Hadoop is written in Java and needs a JVM to run.

```
jdk-8u161-linux-x64.tar.gz
```

### Configure System Environment Variables

```shell
#JAVA
export JAVA_HOME=/home/hadoop/jdk1.8.0_161
export CLASSPATH=$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
#hadoop
export HADOOP_HOME=/home/hadoop/hadoop-3.3.0
export PATH=$JAVA_HOME/bin:$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
```

### Hadoop Configuration

> All of the following files live in the hadoop-3.3.0/etc/hadoop directory.

#### Configure JAVA_HOME

hadoop-env.sh

```shell
export JAVA_HOME=/home/hadoop/jdk1.8.0_161
```
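A quick sanity check that the JVM and Hadoop binaries now resolve saves debugging later. A minimal sketch; it assumes the export lines above were added to `~/.bash_profile` (the original does not say which profile file was used):

```shell
# Re-read the profile so the new variables take effect in this shell
source ~/.bash_profile

java -version     # expect: java version "1.8.0_161"
hadoop version    # expect: Hadoop 3.3.0
```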
#### Configure the Core Component File

core-site.xml: add the following between the `<configuration>` and `</configuration>` tags.

```xml
<property>
    <name>fs.defaultFS</name>
    <value>hdfs://172.16.176.103:9000</value>
</property>
<property>
    <name>hadoop.tmp.dir</name>
    <value>/data</value>
</property>
```

#### Configure the File System

hdfs-site.xml: likewise, add between `<configuration>` and `</configuration>`.

```xml
<property>
    <name>dfs.namenode.name.dir</name>
    <value>/data/namenode</value>
</property>
<property>
    <name>dfs.datanode.data.dir</name>
    <value>/data/datanode</value>
</property>
<property>
    <name>dfs.replication</name>
    <value>1</value>
</property>
<property>
    <name>dfs.permissions</name>
    <value>false</value>
</property>
```
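The daemons read these files only at startup, and a typo here fails silently until then. `hdfs getconf` prints the effective value of any key, which makes a quick pre-flight check possible (run with the environment variables above in effect):

```shell
# Print effective configuration values as the daemons will resolve them
hdfs getconf -confKey fs.defaultFS       # expect: hdfs://172.16.176.103:9000
hdfs getconf -confKey dfs.replication    # expect: 1
```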
#### Configure mapred

mapred-site.xml

```xml
<property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
</property>
```

#### Configure yarn-site.xml

yarn-site.xml

```xml
<property>
    <name>yarn.resourcemanager.hostname</name>
    <value>elasticsearch01</value>
</property>
```

### Format the File System

```
hdfs namenode -format
```

### Start HDFS

start-dfs.sh

```shell
$ start-dfs.sh
WARNING: HADOOP_SECURE_DN_USER has been replaced by HDFS_DATANODE_SECURE_USER. Using value of HADOOP_SECURE_DN_USER.
Starting namenodes on [host103]
Starting datanodes
Starting secondary namenodes [host103]
```

### Access

```
http://localhost:9870/
```
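The snapshot repository created later points at the HDFS path `/data`, and Elasticsearch runs as the `es` user (that user shows up in the permission error under Common Problems below). Setting `dfs.permissions` to `false` above sidesteps the issue globally; a narrower alternative, sketched here, is to create the path up front and hand it to `es`. Run as the `hadoop` user:

```shell
# Create the repository root the ES repository-hdfs plugin will write into
hdfs dfs -mkdir -p /data

# Make the es user its owner rather than disabling permission checks globally
hdfs dfs -chown es /data

# Confirm ownership and permissions
hdfs dfs -ls /
```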
## ES Plugin Installation

Every node in the cluster must have the hdfs plugin installed, and ES must be `restarted` after installation.

### Plugin Download

> The plugin version must match the ES version.

[Download](https://artifacts.elastic.co/downloads/elasticsearch-plugins/repository-hdfs/repository-hdfs-7.2.0.zip)

```
repository-hdfs-7.2.0.zip
```

### Plugin Install

> Download the package ahead of time and install offline.

Install on each node of the cluster in turn:

sudo bin/elasticsearch-plugin install file:///path/to/plugin.zip

```shell
$ ./elasticsearch-plugin install file:///home/es/repository-hdfs-7.2.0.zip
-> Downloading file:///home/es/repository-hdfs-7.2.0.zip
[=================================================] 100%
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@     WARNING: plugin requires additional permissions     @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
* java.lang.RuntimePermission accessClassInPackage.sun.security.krb5
* java.lang.RuntimePermission accessDeclaredMembers
* java.lang.RuntimePermission getClassLoader
* java.lang.RuntimePermission loadLibrary.jaas
* java.lang.RuntimePermission loadLibrary.jaas_nt
* java.lang.RuntimePermission loadLibrary.jaas_unix
* java.lang.RuntimePermission setContextClassLoader
* java.lang.RuntimePermission shutdownHooks
* java.lang.reflect.ReflectPermission suppressAccessChecks
* java.net.SocketPermission * connect,resolve
* java.net.SocketPermission localhost:0 listen,resolve
* java.security.SecurityPermission insertProvider.SaslPlainServer
* java.security.SecurityPermission putProviderProperty.SaslPlainServer
* java.util.PropertyPermission * read,write
* javax.security.auth.AuthPermission doAs
* javax.security.auth.AuthPermission getSubject
* javax.security.auth.AuthPermission modifyPrincipals
* javax.security.auth.AuthPermission modifyPrivateCredentials
* javax.security.auth.AuthPermission modifyPublicCredentials
* javax.security.auth.PrivateCredentialPermission javax.security.auth.kerberos.KerberosTicket * "*" read
* javax.security.auth.PrivateCredentialPermission javax.security.auth.kerberos.KeyTab * "*" read
* javax.security.auth.PrivateCredentialPermission org.apache.hadoop.security.Credentials * "*" read
* javax.security.auth.kerberos.ServicePermission * initiate
See http://docs.oracle.com/javase/8/docs/technotes/guides/security/permissions.html
for descriptions of what these permissions allow and the associated risks.

Continue with installation? [y/N]y
-> Installed repository-hdfs
$
```
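After the rolling restart it is worth confirming that every node actually loaded the plugin, either locally on each host or cluster-wide over the REST API (the `localhost:9200` address is an assumption; substitute a real node):

```shell
# Per node, from the Elasticsearch home directory
bin/elasticsearch-plugin list            # expect: repository-hdfs

# Or ask the whole cluster at once
curl -s "http://localhost:9200/_cat/plugins?v"
```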
## Create the Repository

- Create

```json
PUT _snapshot/my_hdfs_repository
{
  "type": "hdfs",                                    --repository type
  "settings": {
    "uri": "hdfs://172.16.176.103:9000/",            --HDFS access URI
    "path": "/data",
    "conf.dfs.client.read.shortcircuit": "false"
  }
}
```

- View

```json
GET /_snapshot

{
  "my_hdfs_repository" : {
    "type" : "hdfs",
    "settings" : {
      "path" : "/data",
      "uri" : "hdfs://172.16.176.103:9000/",
      "conf" : {
        "dfs" : {
          "client" : {
            "read" : {
              "shortcircuit" : "false"
            }
          }
        }
      }
    }
  }
}
```

## Create a Snapshot

- Create the snapshot. The request returns at once rather than waiting for the snapshot to finish (append `?wait_for_completion=true` to block until it completes).

```json
PUT _snapshot/my_hdfs_repository/snapshot_i_xfjbblxt_cxfw_xfj_d12
{
  "indices": "i_xfjbblxt_cxfw_xfj_d12"
}
```

- Check the snapshot's current state

```json
GET _snapshot/my_hdfs_repository/snapshot_i_xfjbblxt_cxfw_xfj_d12

{
  "snapshots" : [
    {
      "snapshot" : "snapshot_i_xfjbblxt_cxfw_xfj_d12",
      "uuid" : "-BS9XjxvS1Sp6wW_bT02lA",
      "version_id" : 7020099,
      "version" : "7.2.0",
      "indices" : [
        "i_xfjbblxt_cxfw_xfj_d12"
      ],
      "include_global_state" : true,
      "state" : "IN_PROGRESS",                       --snapshot in progress
      "start_time" : "2020-10-12T14:04:49.425Z",     --start time
      "start_time_in_millis" : 1602511489425,
      "end_time" : "1970-01-01T00:00:00.000Z",
      "end_time_in_millis" : 0,
      "duration_in_millis" : -1602511489425,
      "failures" : [ ],
      "shards" : {
        "total" : 0,
        "failed" : 0,
        "successful" : 0
      }
    }
  ]
}
```

- Completed state

```json
{
  "snapshots" : [
    {
      "snapshot" : "snapshot_i_xfjbblxt_cxfw_xfj_d12",  --snapshot name
      "uuid" : "-BS9XjxvS1Sp6wW_bT02lA",
      "version_id" : 7020099,
      "version" : "7.2.0",
      "indices" : [
        "i_xfjbblxt_cxfw_xfj_d12"                       --index
      ],
      "include_global_state" : true,
      "state" : "SUCCESS",                              --snapshot succeeded
      "start_time" : "2020-10-12T14:04:49.425Z",        --start time
      "start_time_in_millis" : 1602511489425,           --start timestamp
      "end_time" : "2020-10-12T14:24:33.942Z",          --end time
      "end_time_in_millis" : 1602512673942,             --end timestamp
      "duration_in_millis" : 1184517,                   --duration (ms)
      "failures" : [ ],
      "shards" : {
        "total" : 5,                                    --total shards
        "failed" : 0,
        "successful" : 5                                --successful shards
      }
    }
  ]
}
```

## Restore a Snapshot

**To restore a snapshot into the original index, the index must first be closed or deleted; only then can the restore run.**

- Restore the snapshot

```json
POST _snapshot/my_hdfs_repository/snapshot_i_xfjbblxt_cxfw_xfj_d12/_restore
{
  "indices": "i_xfjbblxt_cxfw_xfj_d12",                     --index name inside the snapshot
  "rename_pattern": "i_xfjbblxt_cxfw_xfj_d12",              --pattern the index name is matched against
  "rename_replacement": "restored_i_xfjbblxt_cxfw_xfj_d12"  --name the index is restored under
}
```

- Status view

```json
{
  "restored_i_xfjbblxt_cxfw_xfj_d12" : {
    "shards" : [
      {
        "id" : 4,
        "type" : "SNAPSHOT",
        "stage" : "INDEX",
        "primary" : true,
        "start_time_in_millis" : 1602571287856,
        "total_time_in_millis" : 1249147,
        "source" : {
          "repository" : "my_hdfs_repository",
          "snapshot" : "snapshot_i_xfjbblxt_cxfw_xfj_d12",
          "version" : "7.2.0",
          "index" : "i_xfjbblxt_cxfw_xfj_d12",
          "restoreUUID" : "KM1EaKsAQkO4OxB0PwKe0Q"
        },
        "target" : {
          "id" : "DWvUrfqQRxGLIWm6SQmunA",
          "host" : "172.16.176.104",
          "transport_address" : "172.16.176.104:9300",
          "ip" : "172.16.176.104",
          "name" : "node-104"
        },
        "index" : {
          "size" : {
            "total_in_bytes" : 8312825377,
            "reused_in_bytes" : 0,
            "recovered_in_bytes" : 6781859331,
            "percent" : "81.6%"
          },
          "files" : {
            "total" : 104,
            "reused" : 0,
            "recovered" : 86,
            "percent" : "82.7%"
          },
          "total_time_in_millis" : 1249039,
          "source_throttle_time_in_millis" : 0,
          "target_throttle_time_in_millis" : 0
        },
        "translog" : {
          "recovered" : 0,
          "total" : 0,
          "percent" : "100.0%",
          "total_on_start" : 0,
          "total_time_in_millis" : 0
        },
        "verify_index" : {
          "check_index_time_in_millis" : 0,
          "total_time_in_millis" : 0
        }
      },
      --partially omitted
```
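A restore runs through the normal shard recovery machinery, so besides the per-index recovery output above, overall progress can be polled in compact form with the cat recovery API (the node address is an assumption; use any node in the cluster):

```shell
# One line per shard; active_only=true hides recoveries that already finished
curl -s "http://172.16.176.104:9200/_cat/recovery?v&active_only=true"
```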
# Backup and Restore Times

## Snapshot Details for This Case

> First snapshot

| Nodes | Primary shards | Replica shards | Documents | Size | Snapshot size | Time (snapshot) |
| ----- | -------------- | -------------- | --------- | ------ | ------------- | ---------------- |
| 3 | 5 | 1 | 5149535 | 77.4gb | 40gb | 19.74195 minutes |

## Snapshot Restore Details for This Case

**Shards are restored in parallel.**

| Shard | Time (restore) | Bytes restored |
| ----------- | -------------- | -------------- |
| 0 (primary) | 27.42 minutes | 7.75G |
| 1 (primary) | 27.14 minutes | 7.72G |
| 2 (primary) | 27.45 minutes | 7.75G |
| 3 (primary) | 25.89 minutes | 7.74G |
| 4 (primary) | 25.5 minutes | 7.74G |
| 0 (replica) | 18.65 minutes | 7.75G |
| 1 (replica) | 10.3 minutes | 7.72G |
| 2 (replica) | 17.21 minutes | 7.75G |
| 3 (replica) | 10.6 minutes | 7.74G |
| 4 (replica) | 18.32 minutes | 7.74G |

# Common Problems

## Starting HDFS

### Problem 1

```shell
$ start-dfs.sh
WARNING: HADOOP_SECURE_DN_USER has been replaced by HDFS_DATANODE_SECURE_USER. Using value of HADOOP_SECURE_DN_USER.
Starting namenodes on [host103]
Last login: Sun Oct 11 22:32:11 CST 2020 from 172.16.176.46 on pts/1
host103: ERROR: JAVA_HOME is not set and could not be found.
Starting datanodes
Last login: Sun Oct 11 22:32:23 CST 2020 on pts/1
localhost: ERROR: JAVA_HOME is not set and could not be found.
Starting secondary namenodes [host103]
Last login: Sun Oct 11 22:32:24 CST 2020 on pts/1
host103: ERROR: JAVA_HOME is not set and could not be found.
```

- Fix

Configure the Java environment variables:

```shell
export JAVA_HOME=/home/hadoop/jdk1.8.0_161
export CLASSPATH=$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export PATH=$JAVA_HOME/bin:$PATH
```

### Problem 2

```shell
$ start-dfs.sh
WARNING: HADOOP_SECURE_DN_USER has been replaced by HDFS_DATANODE_SECURE_USER. Using value of HADOOP_SECURE_DN_USER.
Starting namenodes on [host103]
host103: Permission denied (publickey,gssapi-keyex,gssapi-with-mic,password).
Starting datanodes
localhost: Permission denied (publickey,gssapi-keyex,gssapi-with-mic,password).
Starting secondary namenodes [host103]
host103: Permission denied (publickey,gssapi-keyex,gssapi-with-mic,password).
```

- Fix

> Run as the hadoop user

```shell
[hadoop@host103 ~]$ ssh-copy-id hadoop@host103
/usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/home/hadoop/.ssh/id_rsa.pub"
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
hadoop@host103's password:

Number of key(s) added: 1

Now try logging into the machine, with:   "ssh 'hadoop@host103'"
and check to make sure that only the key(s) you wanted were added.
```
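Since start-dfs.sh reaches every daemon over SSH, a quick non-interactive login test confirms the key actually took effect before retrying:

```shell
# Should print the hostname without prompting for a password
ssh hadoop@host103 hostname
```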
## Creating the Repository

### Problem 1

- Create

```json
PUT _snapshot/my_hdfs_repository
{
  "type": "hdfs",
  "settings": {
    "uri": "hdfs://172.16.176.103:9000/",
    "path": "/",
    "conf.dfs.client.read.shortcircuit": "false"
  }
}
```

- Error

```shell
"error": {
  "root_cause": [
    {
      "type": "repository_exception",
      "reason": "[my_hdfs_repository] cannot create blob store"
    }
  ],
  "type": "repository_exception",
  "reason": "[my_hdfs_repository] cannot create blob store",
  "caused_by": {
    "type": "unchecked_i_o_exception",
    "reason": "Cannot create HDFS repository for uri [hdfs://172.16.176.103:9000/]",
    "caused_by": {
      "type": "access_control_exception",
      "reason": "Permission denied: user=es, access=WRITE, inode=\"/\":hadoop:supergroup:drwxr-xr-x\n\tat org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:496)\n\tat org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:336)\n\tat org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermissionWithContext(FSPermissionChecker.java:360)\n\tat org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:239)\n\tat org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1909)\n\tat org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1893)\n\tat org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkAncestorAccess(FSDirectory.java:1852)\n\tat org.apache.hadoop.hdfs.server.namenode.FSDirMkdirOp.mkdirs(FSDirMkdirOp.java:60)\n\tat org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:3407)\n\tat org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:1161)\n\tat org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:739)\n\tat org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)\n\tat org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:532)\n\tat org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1070)\n\tat org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1020)\n\tat org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:948)\n\tat java.security.AccessController.doPrivileged(Native Method)\n\tat javax.security.auth.Subject.doAs(Subject.java:422)\n\tat org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1845)\n\tat org.apache.hadoop.ipc.Server$Handler.run(Server.java:2952)\n",
```

- Fix

The repository path here was `/`, whose inode is owned by hadoop:supergroup, so HDFS denies WRITE access to user `es`. Add the following to hdfs-site.xml (note this disables HDFS permission checking entirely; giving the `es` user ownership of the repository path, as sketched at the end of the HDFS File System section, is a narrower fix):

```xml
<property>
    <name>dfs.permissions</name>
    <value>false</value>
</property>
```

# References

- repository-hdfs plugin: https://www.elastic.co/guide/en/elasticsearch/plugins/7.2/repository-hdfs.html
- HDFS SingleCluster setup: https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/SingleCluster.html