[Original] "Learning Elasticsearch from Scratch": A First Look at Elasticsearch

Contents

1. What is Elasticsearch
2. Basic concepts in Elasticsearch
3. Installing Elasticsearch
4. Accessing Elasticsearch

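1. What is Elasticsearch

Elasticsearch is a real-time, distributed search and analytics engine built on Lucene. It works out of the box and combines three capabilities: full-text search, structured search, and analytics.
Why not use Lucene directly? Lucene is only a full-text search library. It provides a large API, but it is not a complete search engine: to use Lucene you have to write the code yourself and wrap it into a working full-text search engine.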

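2. Basic concepts in Elasticsearch

  • field: a field.
  • Document: a document, that is, one record, represented in JSON. A Document contains multiple fields; each key in the JSON is a field.
  • Type: a type, a grouping of Documents. It is similar to a table in MySQL, but not identical. A Type contains multiple Documents. Documents in the same Type may have different fields, but it is best to keep them consistent. Note that since Elasticsearch 6.0 an Index can hold only one Type, and Types are removed in later versions.
  • Index: an index, similar to a database in MySQL. Before 6.0 an Index could contain multiple Types; in the 6.x series used here it contains a single Type. By default, every field in a Document is indexed, which is what makes those fields searchable. Elasticsearch relies on an inverted index, which serves the same purpose as a B+Tree index in MySQL: speeding up retrieval. A later article covers the inverted index in detail.
  • shard: a shard. The data in an Index can be split into multiple shards and stored on several servers, which increases the amount of data an Index can hold, speeds up retrieval, and improves system performance.
  • replica: a replica, a copy of a shard that holds the same data and serves as a backup. When a shard fails, data can be read from its replica, so the system keeps working.
  • Node: a node, a single Elasticsearch instance. By default the node name is assigned randomly.
  • Cluster: a cluster, a group of Elasticsearch instances. The default cluster name is elasticsearch.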
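
To make the inverted index idea above more concrete, here is a minimal Python sketch. It is only an illustration under simplifying assumptions: terms are produced by lowercasing and splitting on whitespace, and the names build_inverted_index and search are made up for this example. The real analysis and storage in Elasticsearch and Lucene are far more elaborate.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercase term to the sorted list of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        # Assumption: naive whitespace tokenization, no stemming or stop words.
        for term in text.lower().split():
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


def search(index, query):
    """Return ids of documents that contain every term of the query (AND)."""
    terms = query.lower().split()
    if not terms:
        return []
    result = set(index.get(terms[0], []))
    for term in terms[1:]:
        result &= set(index.get(term, []))
    return sorted(result)


docs = {
    1: "Elasticsearch is built on Lucene",
    2: "Lucene is a search library",
    3: "Elasticsearch is a distributed search engine",
}
index = build_inverted_index(docs)
print(index["lucene"])                        # [1, 2]
print(search(index, "elasticsearch search"))  # [3]
```

A lookup goes from a term straight to the documents that contain it, without scanning every document, which is why this structure speeds up full-text retrieval.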

3. Installing Elasticsearch

Prerequisite: JDK 8 is already installed on the system.
Download and extract:

cd /usr/local
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-6.6.0.tar.gz
tar -zxvf elasticsearch-6.6.0.tar.gz -C .

Look at the extracted directory:

[root@153-215 local]# cd elasticsearch-6.6.0
[root@153-215 elasticsearch-6.6.0]# ls
bin config lib LICENSE.txt logs modules NOTICE.txt plugins README.textile

Start Elasticsearch:

[root@153-215 elasticsearch-6.6.0]# bin/elasticsearch
Java HotSpot(TM) 64-Bit Server VM warning: INFO: os::commit_memory(0x00000000d4cc0000, 724828160, 0) failed; error='Cannot allocate memory' (errno=12)
#
# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (mmap) failed to map 724828160 bytes for committing reserved memory.
# An error report file with more information is saved as:
# logs/hs_err_pid16393.log

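The JVM could not allocate the memory it asked for, so I looked at the Elasticsearch startup script to see whether it imposes a memory requirement at startup: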

[root@153-215 elasticsearch-6.6.0]# vim bin/elasticsearch
#!/bin/bash

# CONTROLLING STARTUP:
#
# This script relies on a few environment variables to determine startup
# behavior, those variables are:
#
# ES_PATH_CONF -- Path to config directory
# ES_JAVA_OPTS -- External Java Opts on top of the defaults set
#
# Optionally, exact memory values can be set using the `ES_JAVA_OPTS`. Note that
# the Xms and Xmx lines in the JVM options file must be commented out. Example
# values are "512m", and "10g".
#
# ES_JAVA_OPTS="-Xms8g -Xmx8g" ./bin/elasticsearch

source "`dirname "$0"`"/elasticsearch-env

ES_JVM_OPTIONS="$ES_PATH_CONF"/jvm.options
JVM_OPTIONS=`"$JAVA" -cp "$ES_CLASSPATH" org.elasticsearch.tools.launchers.JvmOptionsParser "$ES_JVM_OPTIONS"`
ES_JAVA_OPTS="${JVM_OPTIONS//\$\{ES_TMPDIR\}/$ES_TMPDIR} $ES_JAVA_OPTS"
......

The script reads the jvm.options file at startup, so I looked at that file:

[root@153-215 elasticsearch-6.6.0]# ls config
elasticsearch.yml jvm.options log4j2.properties role_mapping.yml roles.yml users users_roles
[root@153-215 elasticsearch-6.6.0]# cat config/jvm.options
## JVM configuration

################################################################
## IMPORTANT: JVM heap size
###
#############################################################
##
## You should always set the min and max JVM heap
## size to the same value. For example, to set
## the heap to 4 GB, set:
##
## -Xms4g
## -Xmx4g
##
## See https://www.elastic.co/guide/en/elasticsearch/reference/current/heap-size.html
## for more information
##
################################################################

# Xms represents the initial size of total heap space
# Xmx represents the maximum size of total heap space

-Xms1g
-Xmx1g
......

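The default heap is 1 GB, which is more than this machine can provide. Change the minimum and maximum JVM heap sizes as follows: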

-Xms256m
-Xmx256m
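
The comment in jvm.options recommends setting the minimum and maximum heap to the same value. As a rough sketch of how one might check that after editing the file, the Python below parses the uncommented -Xms and -Xmx lines from a text sample. The function names parse_heap_settings and to_bytes are invented for this example, and it assumes the heap lines appear in the plain form shown above.

```python
import re


def parse_heap_settings(jvm_options_text):
    """Return {'Xms': ..., 'Xmx': ...} taken from uncommented lines."""
    settings = {}
    for line in jvm_options_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments such as "## -Xms4g"
        match = re.fullmatch(r"-(Xms|Xmx)(\d+[kKmMgG]?)", line)
        if match:
            settings[match.group(1)] = match.group(2)
    return settings


def to_bytes(size):
    """Convert a JVM size string such as '256m' or '1g' to bytes."""
    units = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}
    suffix = size[-1].lower()
    if suffix in units:
        return int(size[:-1]) * units[suffix]
    return int(size)


# Sample text modelled on the edited config/jvm.options above.
sample = """
## -Xms4g
## -Xmx4g
-Xms256m
-Xmx256m
"""
heap = parse_heap_settings(sample)
print(heap)                                            # {'Xms': '256m', 'Xmx': '256m'}
print(to_bytes(heap["Xms"]) == to_bytes(heap["Xmx"]))  # True
```

To check a real installation, read config/jvm.options into a string and pass it to parse_heap_settings.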

Start Elasticsearch again:

[root@153-215 elasticsearch-6.6.0]# bin/elasticsearch
[2019-02-13T16:42:53,177][WARN ][o.e.b.ElasticsearchUncaughtExceptionHandler] [unknown] uncaught exception in thread [main]
org.elasticsearch.bootstrap.StartupException: java.lang.RuntimeException: can not run elasticsearch as root
at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:163) ~[elasticsearch-6.6.0.jar:6.6.0]
at org.elasticsearch.bootstrap.Elasticsearch.execute(Elasticsearch.java:150) ~[elasticsearch-6.6.0.jar:6.6.0]
at org.elasticsearch.cli.EnvironmentAwareCommand.execute(EnvironmentAwareCommand.java:86) ~[elasticsearch-6.6.0.jar:6.6.0]
at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:124) ~[elasticsearch-cli-6.6.0.jar:6.6.0]
at org.elasticsearch.cli.Command.main(Command.java:90) ~[elasticsearch-cli-6.6.0.jar:6.6.0]
at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:116) ~[elasticsearch-6.6.0.jar:6.6.0]
at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:93) ~[elasticsearch-6.6.0.jar:6.6.0]
Caused by: java.lang.RuntimeException: can not run elasticsearch as root
at org.elasticsearch.bootstrap.Bootstrap.initializeNatives(Bootstrap.java:103) ~[elasticsearch-6.6.0.jar:6.6.0]
at org.elasticsearch.bootstrap.Bootstrap.setup(Bootstrap.java:170) ~[elasticsearch-6.6.0.jar:6.6.0]
at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:333) ~[elasticsearch-6.6.0.jar:6.6.0]
at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:159) ~[elasticsearch-6.6.0.jar:6.6.0]
... 6 more

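This error means that Elasticsearch cannot be started as the root user, a restriction imposed for system security. So here I first changed the owner of the Elasticsearch directory and the files in it to another user, and then started Elasticsearch as that user: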

[root@153-215 elasticsearch-6.6.0]# cd ..
[root@153-215 local]# chown -R lilinru:lilinru elasticsearch-6.6.0
[root@153-215 local]# su lilinru
[lilinru@153-215 local]$ cd elasticsearch-6.6.0
[lilinru@153-215 elasticsearch-6.6.0]$ bin/elasticsearch
....
[2019-02-13T17:10:23,443][INFO ][o.e.n.Node ] [_xV7bTf] starting ...
[2019-02-13T17:10:23,618][INFO ][o.e.t.TransportService ] [_xV7bTf] publish_address {127.0.0.1:9300}, bound_addresses {127.0.0.1:9300}
[2019-02-13T17:10:23,636][WARN ][o.e.b.BootstrapChecks ] [_xV7bTf] max file descriptors [65535] for elasticsearch process is too low, increase to at least [65536]
[2019-02-13T17:10:23,636][WARN ][o.e.b.BootstrapChecks ] [_xV7bTf] max virtual memory areas vm.max_map_count [65530] is too low, increase to at least [262144]
....

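Two problems show up at startup:
Problem 1: max file descriptors [65535] for elasticsearch process is too low, increase to at least [65536]
To fix this, edit /etc/security/limits.conf and set the soft nofile and hard nofile entries near the end of the file to 65536. The new limits generally apply only to sessions started after the change, so log in again before restarting Elasticsearch: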

[root@153-215 elasticsearch-6.6.0]# vim /etc/security/limits.conf 
...
# End of file
...
* soft nofile 65536
* hard nofile 65536
...

Problem 2: max virtual memory areas vm.max_map_count [65530] is too low, increase to at least [262144]
To fix this, edit /etc/sysctl.conf and add the following line at the end of the file:

vm.max_map_count=262144

Note that after adding this setting you also need to run sysctl -p to reload the sysctl.conf configuration file.

With both problems fixed, restart Elasticsearch: both warnings are gone and it starts successfully.
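
Before restarting, it can help to confirm that the two limits are now high enough. The following Python sketch compares a file descriptor limit and a vm.max_map_count value against the minimums quoted in the two warnings. The helper names check_limits and read_current_limits are invented for this example, and it is only a local sanity check under the assumption of a Linux host, not a reproduction of how Elasticsearch performs its own bootstrap checks.

```python
MIN_NOFILE = 65536          # minimum from the first bootstrap warning
MIN_MAX_MAP_COUNT = 262144  # minimum from the second bootstrap warning


def check_limits(nofile, max_map_count):
    """Return a list of problems; an empty list means both limits are high enough."""
    problems = []
    if nofile < MIN_NOFILE:
        problems.append(
            f"max file descriptors [{nofile}] is below {MIN_NOFILE}")
    if max_map_count < MIN_MAX_MAP_COUNT:
        problems.append(
            f"vm.max_map_count [{max_map_count}] is below {MIN_MAX_MAP_COUNT}")
    return problems


def read_current_limits():
    """Read the current process limits on Linux (not called automatically)."""
    import resource  # Unix-only standard library module
    soft_nofile, _hard_nofile = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft_nofile == resource.RLIM_INFINITY:
        soft_nofile = float("inf")  # treat "unlimited" as large enough
    with open("/proc/sys/vm/max_map_count") as f:
        max_map_count = int(f.read().strip())
    return soft_nofile, max_map_count


print(check_limits(65535, 65530))   # values from the warnings: two problems
print(check_limits(65536, 262144))  # values after the fix: []
```

Run as the user that starts Elasticsearch, check_limits(*read_current_limits()) should return an empty list once both changes have taken effect.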

4. Accessing Elasticsearch

Open another terminal window and send a request to Elasticsearch:

[root@153-215 ~]# curl localhost:9200
{
"name" : "_xV7bTf",
"cluster_name" : "elasticsearch",
"cluster_uuid" : "i3whIPX_Qx2zvaJVZKQY1g",
"version" : {
"number" : "6.6.0",
"build_flavor" : "default",
"build_type" : "tar",
"build_hash" : "a9861f4",
"build_date" : "2019-01-24T11:27:09.439740Z",
"build_snapshot" : false,
"lucene_version" : "7.6.0",
"minimum_wire_compatibility_version" : "5.6.0",
"minimum_index_compatibility_version" : "5.0.0"
},
"tagline" : "You Know, for Search"
}

As you can see, Elasticsearch returns a JSON object containing the current node name, the cluster name, the cluster UUID, version information, and a tagline.
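
The same response can be read from a program. The sketch below parses an abbreviated copy of the JSON above with Python's standard json module and picks out a few fields. The names summarize and fetch_root are invented for this example, and fetch_root assumes a node listening on localhost:9200 as in this article, so it is defined but not called here.

```python
import json

# Abbreviated copy of the response shown above.
SAMPLE_RESPONSE = """
{
  "name" : "_xV7bTf",
  "cluster_name" : "elasticsearch",
  "version" : {
    "number" : "6.6.0",
    "lucene_version" : "7.6.0"
  },
  "tagline" : "You Know, for Search"
}
"""


def summarize(response_text):
    """Extract node name, cluster name, and versions from the root response."""
    info = json.loads(response_text)
    return {
        "node": info["name"],
        "cluster": info["cluster_name"],
        "version": info["version"]["number"],
        "lucene": info["version"]["lucene_version"],
    }


def fetch_root(url="http://localhost:9200"):
    """Fetch the root endpoint; requires a running node, so it is not called here."""
    from urllib.request import urlopen
    with urlopen(url, timeout=5) as resp:
        return resp.read().decode("utf-8")


summary = summarize(SAMPLE_RESPONSE)
print(summary["node"], summary["cluster"], summary["version"])
```

With a node running, summarize(fetch_root()) would produce the same kind of summary from the live response.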

That is all for this first look at Elasticsearch. In later articles we will go deeper step by step and put Elasticsearch to use.
