Installing Elasticsearch 5.x on Linux

阿新 • Published: 2017-12-23

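This guide uses Elasticsearch for full-text search, not for ELK log collection.

When Elasticsearch is used for full-text search, the server and client versions must match, so pay attention to the version you install.

Preface

My system is an Alibaba Cloud CentOS 7.2 64-bit instance with 2 cores and 8 GB of RAM.
Lines that begin with $ are commands for you to type.

I. Java environment setup

Elasticsearch is built on Lucene, so it needs Java to run, and the Java environment must be installed first. Elasticsearch 5.x depends on JDK 1.8, so install JDK 1.8 or later. The following command installs the default Java package from the configured yum repositories: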

$ yum install java

After the installation finishes, verify it:

$ java -version
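
To confirm that the installed Java is new enough for Elasticsearch 5.x, the version string can be checked in a script. The sketch below is only an illustration: the function names java_major, java_ok_for_es5 and installed_java_version are made up for this example, and it assumes java -version prints the version inside double quotes, as OpenJDK and Oracle JDK do.

```shell
#!/bin/sh
# Extract the major Java version from a version string.
# "1.8.0_112" uses the legacy 1.x scheme (major 8); "11.0.2" uses the modern one (major 11).
java_major() {
    v=$1
    case "$v" in
        1.*) v=${v#1.} ;;     # drop the legacy "1." prefix
    esac
    echo "${v%%[._-]*}"       # keep the text before the first ".", "_" or "-"
}

# Succeeds when the given version string is Java 8 or newer.
java_ok_for_es5() {
    [ "$(java_major "$1")" -ge 8 ]
}

# Hypothetical helper: print the quoted version string from "java -version",
# which writes to stderr.
installed_java_version() {
    java -version 2>&1 | awk -F'"' '/version/ { print $2; exit }'
}
```

A possible use is: java_ok_for_es5 "$(installed_java_version)" || echo "Java 8 or later is required".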

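II. Installing the Elasticsearch engine

1. Create an account and assign permissions

The official documentation says Elasticsearch should not be run as the root administrator account (5.x refuses to start as root), so first create an account dedicated to running Elasticsearch.

Create the es group and user. A group is created with groupadd <group>, and a user with useradd -g <group> <user>: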

$ groupadd es
$ useradd -g es es

Set the password:

$ passwd es

Enter and confirm the password when prompted, and the es account is ready.

2. Adjust system parameters

Run vim /etc/security/limits.conf

Append the following at the end (soft nproc and hard nproc can also be set to 65536):

root soft nofile 65535
root hard nofile 65535

#es
es soft nofile 65536
es hard nofile 65536

* soft nofile 65535
* hard nofile 65535

Run vim /etc/sysctl.conf and append this line at the end: vm.max_map_count=262144

After saving the change, run the following command to apply it:

sysctl -p
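
Whether the new values are in effect can be checked before starting Elasticsearch. The following is a rough sketch, not an official tool: the function names meets_minimum and check_es_limits are invented for this example, and the thresholds 65536 and 262144 are the ones Elasticsearch 5.x reports in its bootstrap check errors. Run check_es_limits as the es user in a fresh login session, since limits.conf is applied at login.

```shell
#!/bin/sh
# Succeeds when $1 (actual value) is "unlimited" or a number >= $2 (required value).
meets_minimum() {
    [ "$1" = "unlimited" ] && return 0
    [ "$1" -ge "$2" ] 2>/dev/null
}

# Print one line per check for the current user and kernel settings.
check_es_limits() {
    nofile=$(ulimit -n)
    map_count=$(cat /proc/sys/vm/max_map_count)
    if meets_minimum "$nofile" 65536; then
        echo "nofile ok: $nofile"
    else
        echo "nofile too low: $nofile"
    fi
    if meets_minimum "$map_count" 262144; then
        echo "vm.max_map_count ok: $map_count"
    else
        echo "vm.max_map_count too low: $map_count"
    fi
}
```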

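3. Create the Elasticsearch working directory

$ cd /data/
$ mkdir elasticsearch

4. Download Elasticsearch

Open the official site: https://www.elastic.co/cn/downloads

Choose Elasticsearch and pick the package you need. This guide uses version 5.5.3, which gives you elasticsearch-5.5.3.tar.gz.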

5. Install

Upload the downloaded elasticsearch-5.5.3.tar.gz to the /data/elasticsearch directory.

Extract elasticsearch-5.5.3.tar.gz into that directory:

$ tar -zxvf elasticsearch-5.5.3.tar.gz -C /data/elasticsearch

Check the result:

$ ls
elasticsearch-5.5.3  elasticsearch-5.5.3.tar.gz

Delete the archive:

$ rm -f elasticsearch-5.5.3.tar.gz

Give the es user ownership of /data/elasticsearch. The command syntax is chown [OPTION]... [OWNER][:[GROUP]] FILE...

$ chown -R es:es /data/elasticsearch

6. Configure

Change to the /data/elasticsearch/elasticsearch-5.5.3 directory:

$ cd /data/elasticsearch/elasticsearch-5.5.3

Directory structure:

├── elasticsearch-5.5.3
│   ├── bin
│   │   ├── elasticsearch
│   │   ├── elasticsearch.bat
│   │   ├── elasticsearch.in.bat
│   │   ├── elasticsearch.in.sh
│   │   ├── elasticsearch-keystore
│   │   ├── elasticsearch-keystore.bat
│   │   ├── elasticsearch-plugin
│   │   ├── elasticsearch-plugin.bat
│   │   ├── elasticsearch-service.bat
│   │   ├── elasticsearch-service-mgr.exe
│   │   ├── elasticsearch-service-x64.exe
│   │   ├── elasticsearch-service-x86.exe
│   │   ├── elasticsearch-systemd-pre-exec
│   │   ├── elasticsearch-translog
│   │   └── elasticsearch-translog.bat
│   ├── config
│   │   ├── elasticsearch.yml
│   │   ├── jvm.options
│   │   └── log4j2.properties
│   ├── lib
│   ├── LICENSE.txt
│   ├── modules
│   ├── NOTICE.txt
│   ├── plugins
│   └── README.textile

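Change to the config directory and edit elasticsearch.yml:

$ cd ./config
$ vim elasticsearch.yml

Add the following settings. Note that a space is required after each ":" in the file.

# Cluster name; here it is es-zscs.
# After startup, nodes that share the same cluster name join the same cluster.
cluster.name: es-zscs

# Node name.
node.name: "es-node1"

# Data directory (a single path here).
path.data: /data/elasticsearch/elasticsearch-5.5.3/data

# Log directory.
path.logs: /data/elasticsearch/elasticsearch-5.5.3/logs


# IP address to bind to, IPv4 or IPv6. Elasticsearch 5.x binds to the loopback address by default.
#network.bind_host: 192.168.250.104

# IP address that other nodes use to reach this node. It is chosen automatically if unset; the value must be a real IP address.
#network.publish_host: 192.168.250.104

# Sets both bind_host and publish_host above at once.
#network.host: 192.168.250.104


# TCP port for node-to-node communication; the default is 9300.
transport.tcp.port: 9300

# Whether to compress data sent over the TCP transport; the default is false (no compression).
transport.tcp.compress: true

# HTTP port for external access; the default is 9200.
http.port: 9200

# Whether to serve requests over HTTP; the default is true (enabled).
#http.enabled: false

#discovery.zen.ping.unicast.hosts: ["node 1 IP", "node 2 IP", "node 3 IP"]
# Initial list of master-eligible nodes in the cluster. A starting node (master or data) probes this list to find the cluster.
#discovery.zen.ping.unicast.hosts: ["192.168.137.100", "192.168.137.101", "192.168.137.100:9301"]

# Minimum number of master-eligible nodes that must be visible before a master can be elected.
# For larger clusters a value of 2 to 4 can be used.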
discovery.zen.minimum_master_nodes: 1
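
A value of 1 suits the single-node setup in this guide. For a multi-node cluster, the Elasticsearch documentation recommends setting discovery.zen.minimum_master_nodes to a quorum of the master-eligible nodes, (N / 2) + 1, to avoid split brain. The small helper below only illustrates that arithmetic; the function name is made up for this example.

```shell
#!/bin/sh
# Quorum of master-eligible nodes: (N / 2) + 1, using integer division.
# $1 is the number of master-eligible nodes in the cluster.
minimum_master_nodes() {
    echo $(( $1 / 2 + 1 ))
}
```

For example, a cluster with three master-eligible nodes would use minimum_master_nodes 3, which gives 2.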

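Configuration reference

Property                          Example value                 Purpose
cluster.name                      elk                           Name of the cluster this node belongs to; nodes use it to discover each other
node.name                         elk-es-01                     Name of this node
path.data                         /data/elk/elasticsearch/data  Data directory of this node
path.logs                         /data/elk/elasticsearch/logs  Log directory of this node
network.host                      0.0.0.0                       Address the node binds to; 0.0.0.0 listens on all network interfaces
http.port                         9200                          HTTP port exposed to clients
discovery.zen.ping.unicast.hosts  list of IPs                   Hosts contacted to discover other cluster nodes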

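7. Start the engine

Switch to the es user:

$ su es

Start Elasticsearch (-d runs it in the background as a daemon):

$ /data/elasticsearch/elasticsearch-5.5.3/bin/elasticsearch -d

Check the Elasticsearch process. The sample output below lists only the grep command itself; a running node also shows a java process owned by es.

$ ps -ef |grep elasticsearch
root     30076 25943  0 20:28 pts/0    00:00:00 grep --color=auto elasticsearch

Alternatively, run it in the foreground:

$ /data/elasticsearch/elasticsearch-5.5.3/bin/elasticsearch

The node has started successfully when the output contains [es-node1] started: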

[2017-09-27T09:39:11,080][INFO ][o.e.n.Node               ] [es-node1] initializing ...
[2017-09-27T09:39:11,172][INFO ][o.e.e.NodeEnvironment    ] [es-node1] using [1] data paths, mounts [[/data (/dev/vdb1)]], net usable_space [90.2gb], net total_space [98.3gb], spins? [possibly], types [ext3]
[2017-09-27T09:39:11,173][INFO ][o.e.e.NodeEnvironment    ] [es-node1] heap size [1.9gb], compressed ordinary object pointers [true]
[2017-09-27T09:39:11,174][INFO ][o.e.n.Node               ] [es-node1] node name [es-node1], node ID [u5y2ra-qQL-q3IpdvfT4wA]
[2017-09-27T09:39:11,174][INFO ][o.e.n.Node               ] [es-node1] version[5.5.3], pid[9830], build[9305a5e/2017-09-07T15:56:59.599Z], OS[Linux/3.10.0-514.6.2.el7.x86_64/amd64], JVM[Oracle Corporation/Java HotSpot(TM) 64-Bit Server VM/1.8.0_112/25.112-b15]
[2017-09-27T09:39:11,174][INFO ][o.e.n.Node               ] [es-node1] JVM arguments [-Xms2g, -Xmx2g, -XX:+UseConcMarkSweepGC, -XX:CMSInitiatingOccupancyFraction=75, -XX:+UseCMSInitiatingOccupancyOnly, -XX:+AlwaysPreTouch, -Xss1m, -Djava.awt.headless=true, -Dfile.encoding=UTF-8, -Djna.nosys=true, -Djdk.io.permissionsUseCanonicalPath=true, -Dio.netty.noUnsafe=true, -Dio.netty.noKeySetOptimization=true, -Dio.netty.recycler.maxCapacityPerThread=0, -Dlog4j.shutdownHookEnabled=false, -Dlog4j2.disable.jmx=true, -Dlog4j.skipJansi=true, -XX:+HeapDumpOnOutOfMemoryError, -Des.path.home=/data/elasticsearch/elasticsearch-5.5.3]
[2017-09-27T09:39:12,109][INFO ][o.e.p.PluginsService     ] [es-node1] loaded module [aggs-matrix-stats]
[2017-09-27T09:39:12,109][INFO ][o.e.p.PluginsService     ] [es-node1] loaded module [ingest-common]
[2017-09-27T09:39:12,109][INFO ][o.e.p.PluginsService     ] [es-node1] loaded module [lang-expression]
[2017-09-27T09:39:12,109][INFO ][o.e.p.PluginsService     ] [es-node1] loaded module [lang-groovy]
[2017-09-27T09:39:12,109][INFO ][o.e.p.PluginsService     ] [es-node1] loaded module [lang-mustache]
[2017-09-27T09:39:12,109][INFO ][o.e.p.PluginsService     ] [es-node1] loaded module [lang-painless]
[2017-09-27T09:39:12,109][INFO ][o.e.p.PluginsService     ] [es-node1] loaded module [parent-join]
[2017-09-27T09:39:12,109][INFO ][o.e.p.PluginsService     ] [es-node1] loaded module [percolator]
[2017-09-27T09:39:12,109][INFO ][o.e.p.PluginsService     ] [es-node1] loaded module [reindex]
[2017-09-27T09:39:12,110][INFO ][o.e.p.PluginsService     ] [es-node1] loaded module [transport-netty3]
[2017-09-27T09:39:12,110][INFO ][o.e.p.PluginsService     ] [es-node1] loaded module [transport-netty4]
[2017-09-27T09:39:12,110][INFO ][o.e.p.PluginsService     ] [es-node1] loaded plugin [analysis-ik]
[2017-09-27T09:39:13,899][INFO ][o.e.d.DiscoveryModule    ] [es-node1] using discovery type [zen]
[2017-09-27T09:39:14,389][INFO ][o.e.n.Node               ] [es-node1] initialized
[2017-09-27T09:39:14,389][INFO ][o.e.n.Node               ] [es-node1] starting ...
[2017-09-27T09:39:14,520][INFO ][o.e.t.TransportService   ] [es-node1] publish_address {127.0.0.1:9300}, bound_addresses {127.0.0.1:9300}
[2017-09-27T09:39:17,579][INFO ][o.e.c.s.ClusterService   ] [es-node1] new_master {es-node1}{u5y2ra-qQL-q3IpdvfT4wA}{PALJMXYuQmeQ2ZDaGzAhfw}{127.0.0.1}{127.0.0.1:9300}, reason: zen-disco-elected-as-master ([0] nodes joined)
[2017-09-27T09:39:17,605][INFO ][o.e.h.n.Netty4HttpServerTransport] [es-node1] publish_address {127.0.0.1:9200}, bound_addresses {127.0.0.1:9200}
[2017-09-27T09:39:17,605][INFO ][o.e.n.Node               ] [es-node1] started
[2017-09-27T09:39:17,614][INFO ][o.e.g.GatewayService     ] [es-node1] recovered [0] indices into cluster_state

Test whether the service is up:

$ curl 'http://localhost:9200/?pretty'

A response like the following means everything is working:

{
  "name" : "es-node1",
  "cluster_name" : "es-zscs",
  "cluster_uuid" : "DaViHV9TRaKL-AVobcjfAw",
  "version" : {
    "number" : "5.5.3",
    "build_hash" : "9305a5e",
    "build_date" : "2017-09-07T15:56:59.599Z",
    "build_snapshot" : false,
    "lucene_version" : "6.6.0"
  },
  "tagline" : "You Know, for Search"
}
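
Because the server and client versions must match, it can help to read the version number out of this response in a script. The sketch below does that with sed; es_version_number is a name invented for this example, and it relies on the response having a single "number" field, as in the output above.

```shell
#!/bin/sh
# Read the root-endpoint JSON on stdin and print the value of version.number.
es_version_number() {
    sed -n 's/.*"number"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}
```

A possible use is: curl -s 'http://localhost:9200/?pretty' | es_version_number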

8. Troubleshooting

ERROR: [2] bootstrap checks failed
[1]: max file descriptors [65535] for elasticsearch process is too low, increase to at least [65536]
[2]: max virtual memory areas vm.max_map_count [65530] is too low, increase to at least [262144]

Solution

Log in to Linux as an administrator and adjust the system parameters.

Run vim /etc/security/limits.conf and append the following at the end (soft nproc and hard nproc can also be set to 65536):

root soft nofile 65535
root hard nofile 65535

#es
es soft nofile 65536
es hard nofile 65536

* soft nofile 65535
* hard nofile 65535

Run vim /etc/sysctl.conf and append this line at the end: vm.max_map_count=262144

After saving the change, run the following command to apply it:

$ sysctl -p

Reference

http://blog.csdn.net/u012371450/article/details/51776505

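III. Installing Chinese analyzers in Elasticsearch (IK + pinyin)

When installing an analyzer, watch the version: the analyzer version must match the Elasticsearch version.

1. Install IK

IK (elasticsearch-analysis-ik) provides two analyzers: ik_smart produces the fewest, coarsest tokens, while ik_max_word produces fine-grained tokens (possibly via bidirectional matching; I have not read the source).

1.1 Download

IK analyzer download page:

https://github.com/medcl/elasticsearch-analysis-ik

You can also download the release build that matches your version: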

https://github.com/medcl/elasticsearch-analysis-ik/releases
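
Since the plugin version has to match the Elasticsearch version, a quick check on the downloaded file name can catch a mismatch before installation. This is only a sketch: the function names are invented for this example, and it assumes the release file follows the elasticsearch-analysis-ik-<version>.zip naming used in this guide.

```shell
#!/bin/sh
# Print the version embedded in a release file name, for example
# elasticsearch-analysis-ik-5.5.3.zip gives 5.5.3.
plugin_zip_version() {
    name=${1##*/}        # strip any directory part
    name=${name%.zip}    # strip the extension
    echo "${name##*-}"   # keep the text after the last "-"
}

# Succeeds when the plugin zip ($1) matches the Elasticsearch version ($2).
plugin_matches_es() {
    [ "$(plugin_zip_version "$1")" = "$2" ]
}
```

A possible use is: plugin_matches_es elasticsearch-analysis-ik-5.5.3.zip 5.5.3 || echo "version mismatch"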

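1.2 Get the IK plugin package

The IK plugin can be built from the source of the matching version with Maven, or downloaded as a prebuilt package. Both approaches are described below.

1.2.1 Build with Maven

Download the IK source for the matching version, here elasticsearch-analysis-ik-5.5.3.zip, from https://github.com/medcl/elasticsearch-analysis-ik/releases

If there is no release for your exact version, edit pom.xml: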

<properties>
        <!-- Change this version number to match your Elasticsearch version.
        Keep the gap small: a nearby version will probably work, but a distant one is not guaranteed to. -->
        <elasticsearch.version>5.5.3</elasticsearch.version>
        <maven.compiler.target>1.7</maven.compiler.target>
        <elasticsearch.assembly.descriptor>${project.basedir}/src/main/assemblies/plugin.xml</elasticsearch.assembly.descriptor>
        <elasticsearch.plugin.name>analysis-ik</elasticsearch.plugin.name>
        <elasticsearch.plugin.classname>org.elasticsearch.plugin.analysis.ik.AnalysisIkPlugin</elasticsearch.plugin.classname>
        <elasticsearch.plugin.jvm>true</elasticsearch.plugin.jvm>
        <tests.rest.load_packaged>false</tests.rest.load_packaged>
        <skip.unit.tests>true</skip.unit.tests>
        <gpg.keyname>4E899B30</gpg.keyname>
        <gpg.useagent>true</gpg.useagent> 
    </properties>

After downloading, run mvn package to build it:

├─config
├─src
└─target
    ├─archive-tmp
    ├─classes
    ├─generated-sources
    ├─maven-archiver
    ├─maven-status
    ├─releases
    │  └─elasticsearch-analysis-ik-5.5.3.zip
    └─surefire

When the build finishes, the zip package is in the target/releases directory.

Unzip elasticsearch-analysis-ik-5.5.3.zip and copy its contents to /data/elasticsearch/elasticsearch-5.5.3/plugins/analysis-ik.

1.2.2 Download the prebuilt IK plugin package

Download it from https://github.com/medcl/elasticsearch-analysis-ik/releases

Unzip elasticsearch-analysis-ik-5.5.3.zip and copy its contents to /data/elasticsearch/elasticsearch-5.5.3/plugins/analysis-ik.

1.3 Install the IK plugin

Upload the downloaded elasticsearch-analysis-ik-5.5.3.zip to the /data/elasticsearch directory.

Extract it to /data/elasticsearch/elasticsearch-5.5.3/plugins/analysis-ik:

unzip -d ./ik ./elasticsearch-analysis-ik-5.5.3.zip
mv ./ik/elasticsearch/ /data/elasticsearch/elasticsearch-5.5.3/plugins/analysis-ik
rm -rf ik

The IK analyzer is now installed. Restart Elasticsearch to start using it.

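1.4 Testing

IK ships with two analyzers:

ik_max_word: splits the text at the finest granularity, producing as many words as possible
ik_smart: splits the text at the coarsest granularity; characters already assigned to one word are not reused in another

1.4.1 Testing the ik_max_word analyzer

Test with curl:

$ curl -XGET 'http://localhost:9200/_analyze?pretty&analyzer=ik_max_word' -d '聯想是全球最大的筆記本廠商'

Response: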

{
  "tokens" : [
    {
      "token" : "聯想",
      "start_offset" : 0,
      "end_offset" : 2,
      "type" : "CN_WORD",
      "position" : 0
    },
    {
      "token" : "是",
      "start_offset" : 2,
      "end_offset" : 3,
      "type" : "CN_CHAR",
      "position" : 1
    },
    {
      "token" : "全球",
      "start_offset" : 3,
      "end_offset" : 5,
      "type" : "CN_WORD",
      "position" : 2
    },
    {
      "token" : "最大",
      "start_offset" : 5,
      "end_offset" : 7,
      "type" : "CN_WORD",
      "position" : 3
    },
    {
      "token" : "的",
      "start_offset" : 7,
      "end_offset" : 8,
      "type" : "CN_CHAR",
      "position" : 4
    },
    {
      "token" : "筆記本",
      "start_offset" : 8,
      "end_offset" : 11,
      "type" : "CN_WORD",
      "position" : 5
    },
    {
      "token" : "筆記",
      "start_offset" : 8,
      "end_offset" : 10,
      "type" : "CN_WORD",
      "position" : 6
    },
    {
      "token" : "本廠",
      "start_offset" : 10,
      "end_offset" : 12,
      "type" : "CN_WORD",
      "position" : 7
    },
    {
      "token" : "廠商",
      "start_offset" : 11,
      "end_offset" : 13,
      "type" : "CN_WORD",
      "position" : 8
    }
  ]
}
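
The tests in this section pass the analyzer in the query string and the text as a plain request body. Elasticsearch 5.x also accepts a JSON body with "analyzer" and "text" fields, and that is the only form left from 6.0 on. The sketch below builds such a body; analyze_body and analyze are names made up for this example, and the simple printf formatting assumes that neither argument contains double quotes or backslashes.

```shell
#!/bin/sh
# Build the JSON request body for the _analyze API.
# Assumes neither argument contains double quotes or backslashes.
analyze_body() {
    printf '{"analyzer":"%s","text":"%s"}\n' "$1" "$2"
}

# Hypothetical wrapper: analyze the text $2 with the analyzer $1 on a local node.
analyze() {
    curl -s -XGET 'http://localhost:9200/_analyze?pretty' \
         -H 'Content-Type: application/json' \
         -d "$(analyze_body "$1" "$2")"
}
```

A possible use is: analyze ik_max_word '聯想是全球最大的筆記本廠商'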

1.4.2 Testing the ik_smart analyzer

Test with curl:

$ curl -XGET 'http://localhost:9200/_analyze?pretty&analyzer=ik_smart' -d '聯想是全球最大的筆記本廠商'

Response:

{
  "tokens" : [
    {
      "token" : "聯想",
      "start_offset" : 0,
      "end_offset" : 2,
      "type" : "CN_WORD",
      "position" : 0
    },
    {
      "token" : "是",
      "start_offset" : 2,
      "end_offset" : 3,
      "type" : "CN_CHAR",
      "position" : 1
    },
    {
      "token" : "全球",
      "start_offset" : 3,
      "end_offset" : 5,
      "type" : "CN_WORD",
      "position" : 2
    },
    {
      "token" : "最大",
      "start_offset" : 5,
      "end_offset" : 7,
      "type" : "CN_WORD",
      "position" : 3
    },
    {
      "token" : "的",
      "start_offset" : 7,
      "end_offset" : 8,
      "type" : "CN_CHAR",
      "position" : 4
    },
    {
      "token" : "筆記本",
      "start_offset" : 8,
      "end_offset" : 11,
      "type" : "CN_WORD",
      "position" : 5
    },
    {
      "token" : "廠商",
      "start_offset" : 11,
      "end_offset" : 13,
      "type" : "CN_WORD",
      "position" : 6
    }
  ]
}

References:

http://blog.csdn.net/jam00/article/details/52983056
http://www.cnblogs.com/xing901022/p/5910139.html

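1.5 Hot-word update configuration

Internet slang changes daily. How can new buzzwords (or specific terms) be picked up by search in near real time? First, test with IK:

$ curl -XGET 'http://localhost:9200/_analyze?pretty&analyzer=ik_max_word' -d '成龍原名陳港生'

Response: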

{
  "tokens" : [
    {
      "token" : "成龍",
      "start_offset" : 0,
      "end_offset" : 2,
      "type" : "CN_WORD",
      "position" : 0
    },
    {
      "token" : "原名",
      "start_offset" : 2,
      "end_offset" : 4,
      "type" : "CN_WORD",
      "position" : 1
    },
    {
      "token" : "陳",
      "start_offset" : 4,
      "end_offset" : 5,
      "type" : "CN_CHAR",
      "position" : 2
    },
    {
      "token" : "港",
      "start_offset" : 5,
      "end_offset" : 6,
      "type" : "CN_CHAR",
      "position" : 3
    },
    {
      "token" : "生",
      "start_offset" : 6,
      "end_offset" : 7,
      "type" : "CN_CHAR",
      "position" : 4
    }
  ]
}

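IK's main dictionary does not contain the word "陳港生", so it is split into single characters.

Now configure it by editing the IK configuration file: <ES directory>/plugins/ik/config/ik/IKAnalyzer.cfg.xml

Change it as follows: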

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">  
<properties>  
    <comment>IK Analyzer extension configuration</comment>
    <!-- Configure your own extension dictionaries here -->
    <entry key="ext_dict">custom/mydict.dic;custom/single_word_low_freq.dic</entry>
    <!-- Configure your own extension stop-word dictionaries here -->
    <entry key="ext_stopwords">custom/ext_stopword.dic</entry>
    <!-- Configure a remote extension dictionary here -->
    <entry key="remote_ext_dict">http://192.168.1.136/hotWords.php</entry>
    <!-- Configure a remote extension stop-word dictionary here -->
    <!-- <entry key="remote_ext_stopwords">words_location</entry> -->
</properties>

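I use the remote extension dictionary here because another program can update it and Elasticsearch does not need a restart, which is convenient. The custom mydict.dic dictionary is also easy to use: one word per line, added by hand.

Since it is a remote dictionary, it must be a reachable URL. It can be a page or a plain .txt file, but the output must be UTF-8 encoded.

Contents of hotWords.php:

<?php
// Words to add to the dictionary, one per line (nowdoc, so nothing is interpolated).
$s = <<<'EOF'
陳港生
元樓
藍瘦
EOF;
// IK compares these two headers between polls to detect changes.
header('Last-Modified: '.gmdate('D, d M Y H:i:s', time()).' GMT', true, 200);
header('ETag: "5816f349-19"');
echo $s;

IK reads two response headers, Last-Modified and ETag. A change in either one triggers a dictionary reload. IK polls the URL once per minute.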

Restart Elasticsearch and check the startup log; the three words have been loaded:

[2016-10-31 15:08:57,749][INFO ][ik-analyzer              ] 陳港生
[2016-10-31 15:08:57,749][INFO ][ik-analyzer              ] 元樓
[2016-10-31 15:08:57,749][INFO ][ik-analyzer              ] 藍瘦
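
If the words do not show up in the log, one thing to check is whether the remote dictionary URL really returns the Last-Modified and ETag headers. The sketch below extracts a header value from a raw HTTP response; header_value is a name made up for this example. Note also that hotWords.php sets Last-Modified to the current time, so that header changes on every request.

```shell
#!/bin/sh
# Read HTTP response headers on stdin and print the value of header $1.
# The name is matched case-insensitively and the trailing carriage return is removed.
header_value() {
    awk -v want="$1" '
        BEGIN { want = tolower(want) }
        {
            sub(/\r$/, "")
            i = index($0, ":")
            if (i > 0 && tolower(substr($0, 1, i - 1)) == want) {
                v = substr($0, i + 1)
                sub(/^[ \t]+/, "", v)
                print v
                exit
            }
        }'
}
```

A possible use is: curl -sI http://192.168.1.136/hotWords.php | header_value ETag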

Now test again by repeating the request above. The response contains:

...
  }, {
    "token" : "陳港生",
    "start_offset" : 5,
    "end_offset" : 8,
    "type" : "CN_WORD",
    "position" : 2
  }, {
...

The IK analyzer now matches the word "陳港生".

References:

http://blog.csdn.net/jam00/article/details/52983056

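2. The pinyin analyzer

2.1 Install the pinyin analyzer

The pinyin analyzer lets users type pinyin and still find the relevant keywords. For example, typing yonghui into a store's search box matches 永輝, which makes for a much better search experience.

The pinyin analyzer is installed the same way as IK. Download page:

https://github.com/medcl/elasticsearch-analysis-pinyin

Release builds:

https://github.com/medcl/elasticsearch-analysis-pinyin/releases

The installation steps are the same as for IK and are not repeated here.

The install path is /data/elasticsearch/elasticsearch-5.5.3/plugins/analysis-pinyin

Restart Elasticsearch for it to take effect.

2.2 Testing

Test request:

$ curl -XGET 'http://localhost:9200/_analyze?pretty&analyzer=pinyin' -d '劉德華'

Response: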

{
  "tokens" : [
    {
      "token" : "liu",
      "start_offset" : 0,
      "end_offset" : 1,
      "type" : "word",
      "position" : 0
    },
    {
      "token" : "ldh",
      "start_offset" : 0,
      "end_offset" : 3,
      "type" : "word",
      "position" : 0
    },
    {
      "token" : "de",
      "start_offset" : 1,
      "end_offset" : 2,
      "type" : "word",
      "position" : 1
    },
    {
      "token" : "hua",
      "start_offset" : 2,
      "end_offset" : 3,
      "type" : "word",
      "position" : 2
    }
  ]
}

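3. Other analyzers

Reference:

http://www.54tianzhisheng.cn/2017/09/07/Elasticsearch-analyzers/

IV. Tool installation

1. Installing and using Sense

For people unfamiliar with Linux, curl is a hurdle. The Chrome extension Sense makes it much easier to work with Elasticsearch. Users in mainland China need a VPN to reach it.

Start by testing the analyzers.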
