Still haven't learned ES's CUD (create, update, delete)?
Recently I've had to work with ES as part of the daily grind, but I wasn't familiar with querying ES documents, so I've spent the past two weeks studying it and skimming *Elasticsearch: The Definitive Guide*. Another day of slacking off, as they say.
ES is a real-time distributed search and analytics engine built on Lucene. Today we won't talk about its application scenarios; instead, let's walk through creating, modifying, and deleting ES indices.
Environment: CentOS 7, Elasticsearch 6.8.3, JDK 8
(The latest ES is version 7, which requires JDK 11 or above, so I installed ES 6.8.3.)
All the examples below use a student index.
1. Creating an index
PUT http://192.168.197.100:9200/student
{
  "mappings": {
    "_doc": {    // "_doc" is the type; an ES 6 index can hold only one type
      "properties": {
        "id":       { "type": "keyword" },
        "name":     { "type": "text", "index": true, "analyzer": "standard" },
        "age":      { "type": "integer", "fields": { "keyword": { "type": "keyword", "ignore_above": 256 } } },
        "birthday": { "type": "date" },
        "gender":   { "type": "keyword" },
        "grade":    { "type": "text", "fields": { "keyword": { "type": "keyword", "ignore_above": 256 } } },
        "class":    { "type": "text", "fields": { "keyword": { "type": "keyword", "ignore_above": 256 } } }
      }
    }
  },
  "settings": {
    "number_of_shards": 1,     // number of primary shards
    "number_of_replicas": 1    // number of replicas per primary shard
  }
}
The difference between the text and keyword values of the type property:
(1) text is analyzed (tokenized) when indexed and when queried, and is used for full-text search
(2) keyword is not analyzed, and is used for exact matching and aggregations
The index property controls whether a string field is indexed. In old versions (before ES 5.x) it took one of three string values:
(1) analyzed: the field can be matched fuzzily, similar to LIKE in SQL
(2) not_analyzed: the field can only be matched exactly, similar to "=" in SQL
(3) no: the field is not searchable
Since ES 5.x the property is a boolean: true (the default) makes the field searchable, false does not, and whether a field is analyzed is determined by its type instead (text is analyzed, keyword is not). That is why the mapping above uses "index": true.
The analyzer property selects the analyzer; for Chinese this is usually the ik analyzer, and you can also define a custom analyzer (a minimal sketch follows this list).
The number_of_shards property is the number of primary shards; it defaults to 5 (in ES 6) and cannot be changed after the index is created.
The number_of_replicas property is the number of replicas per primary shard; it defaults to 1 and can be changed later.
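As for custom analyzers, here is a minimal sketch defined in the index settings (the index name student_custom and analyzer name my_analyzer are made up for illustration; it simply chains the built-in standard tokenizer with the built-in lowercase filter):

PUT http://192.168.197.100:9200/student_custom
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_analyzer": {             // hypothetical analyzer name
          "type": "custom",
          "tokenizer": "standard",   // built-in tokenizer
          "filter": [ "lowercase" ]  // built-in token filter
        }
      }
    }
  }
}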
If creation succeeds, the following JSON is returned:
{ "acknowledged": true, "shards_acknowledged": true, "index": "student"}
How do you view the index's mapping after creating it?
GET http://192.168.197.100:9200/student/_mapping
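For a quick overview of every index on the node (name, shards, doc count, size), the _cat API is also handy:

GET http://192.168.197.100:9200/_cat/indices?v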
Again: in ES 6, an index can contain only one type, such as "_doc" above.
How ES concepts map to a relational database:

Elasticsearch    Relational database
index        →   database
type         →   table
document     →   row
field        →   column
2. Modifying an index
// change the number of replicas to 2
PUT http://192.168.197.100:9200/student/_settings
{
  "number_of_replicas": 2
}
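A successful update returns { "acknowledged": true }. To confirm the change, read the settings back:

GET http://192.168.197.100:9200/student/_settings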
3. Deleting an index
// delete a single index
DELETE http://192.168.197.100:9200/student

// delete all indices (use with care!)
DELETE http://192.168.197.100:9200/_all
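Deleting _all is risky in production. ES has a setting, action.destructive_requires_name, that makes destructive actions refuse wildcards and _all and require explicit index names. A sketch using the cluster settings API (if the setting is not dynamic in your version, put it in elasticsearch.yml instead):

PUT http://192.168.197.100:9200/_cluster/settings
{
  "transient": {
    "action.destructive_requires_name": true   // require explicit names for destructive actions
  }
}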
4. The default standard analyzer vs. the ik analyzer
ES's default analyzer is standard. For English it splits on whitespace and punctuation, but for Chinese it splits the text into individual characters, so it is not suitable as a Chinese analyzer.
For example, standard on English text:
// this API shows how a piece of text is analyzed
POST http://192.168.197.100:9200/_analyze
{
  "text": "the People's Republic of China",
  "analyzer": "standard"
}
The result:
{ "tokens": [ { "token": "the", "start_offset": 0, "end_offset": 3, "type": "<ALPHANUM>", "position": 0 }, { "token": "people's", "start_offset": 4, "end_offset": 12, "type": "<ALPHANUM>", "position": 1 }, { "token": "republic", "start_offset": 13, "end_offset": 21, "type": "<ALPHANUM>", "position": 2 }, { "token": "of", "start_offset": 22, "end_offset": 24, "type": "<ALPHANUM>", "position": 3 }, { "token": "china", "start_offset": 25, "end_offset": 30, "type": "<ALPHANUM>", "position": 4 } ] }
And on Chinese text:
POST http://192.168.197.100:9200/_analyze
{
  "text": "中華人民共和國萬歲",
  "analyzer": "standard"
}
The result:
{ "tokens": [ { "token": "中", "start_offset": 0, "end_offset": 1, "type": "<IDEOGRAPHIC>", "position": 0 }, { "token": "華", "start_offset": 1, "end_offset": 2, "type": "<IDEOGRAPHIC>", "position": 1 }, { "token": "人", "start_offset": 2, "end_offset": 3, "type": "<IDEOGRAPHIC>", "position": 2 }, { "token": "民", "start_offset": 3, "end_offset": 4, "type": "<IDEOGRAPHIC>", "position": 3 }, { "token": "共", "start_offset": 4, "end_offset": 5, "type": "<IDEOGRAPHIC>", "position": 4 }, { "token": "和", "start_offset": 5, "end_offset": 6, "type": "<IDEOGRAPHIC>", "position": 5 }, { "token": "國", "start_offset": 6, "end_offset": 7, "type": "<IDEOGRAPHIC>", "position": 6 }, { "token": "萬", "start_offset": 7, "end_offset": 8, "type": "<IDEOGRAPHIC>", "position": 7 }, { "token": "歲", "start_offset": 8, "end_offset": 9, "type": "<IDEOGRAPHIC>", "position": 8 } ] }
The ik analyzer does segment Chinese into words. It is a third-party plugin (elasticsearch-analysis-ik) that must be installed separately with bin/elasticsearch-plugin install, using the release that matches your ES version. It ships two analyzers: ik_smart and ik_max_word.
(1) ik_smart: the coarsest-grained segmentation; it splits the text into the fewest, longest words
For example:
POST http://192.168.197.100:9200/_analyze
{
  "text": "中華人民共和國萬歲",
  "analyzer": "ik_smart"
}
The result:
{ "tokens": [ { "token": "中華人民共和國", "start_offset": 0, "end_offset": 7, "type": "CN_WORD", "position": 0 }, { "token": "萬歲", "start_offset": 7, "end_offset": 9, "type": "CN_WORD", "position": 1 } ] }
(2) ik_max_word: the finest-grained segmentation; it extracts as many words as possible from the text
For example:
POST http://192.168.197.100:9200/_analyze
{
  "text": "中華人民共和國萬歲",
  "analyzer": "ik_max_word"
}
The result:
{ "tokens": [ { "token": "中華人民共和國", "start_offset": 0, "end_offset": 7, "type": "CN_WORD", "position": 0 }, { "token": "中華人民", "start_offset": 0, "end_offset": 4, "type": "CN_WORD", "position": 1 }, { "token": "中華", "start_offset": 0, "end_offset": 2, "type": "CN_WORD", "position": 2 }, { "token": "華人", "start_offset": 1, "end_offset": 3, "type": "CN_WORD", "position": 3 }, { "token": "人民共和國", "start_offset": 2, "end_offset": 7, "type": "CN_WORD", "position": 4 }, { "token": "人民", "start_offset": 2, "end_offset": 4, "type": "CN_WORD", "position": 5 }, { "token": "共和國", "start_offset": 4, "end_offset": 7, "type": "CN_WORD", "position": 6 }, { "token": "共和", "start_offset": 4, "end_offset": 6, "type": "CN_WORD", "position": 7 }, { "token": "國", "start_offset": 6, "end_offset": 7, "type": "CN_CHAR", "position": 8 }, { "token": "萬歲", "start_offset": 7, "end_offset": 9, "type": "CN_WORD", "position": 9 }, { "token": "萬", "start_offset": 7, "end_offset": 8, "type": "TYPE_CNUM", "position": 10 }, { "token": "歲", "start_offset": 8, "end_offset": 9, "type": "COUNT", "position": 11 } ] }
ik on English text:
POST http://192.168.197.100:9200/_analyze
{
  "text": "the People's Republic of China",
  "analyzer": "ik_smart"
}
The result is below: ik drops the unimportant words (stop words such as "the" and "of"), while standard keeps them. (My English has decayed to the point where I no longer know what kind of words a/an/the even are; as a Chinese person, that's nothing to be proud of.)
{ "tokens": [ { "token": "people", "start_offset": 4, "end_offset": 10, "type": "ENGLISH", "position": 0 }, { "token": "s", "start_offset": 11, "end_offset": 12, "type": "ENGLISH", "position": 1 }, { "token": "republic", "start_offset": 13, "end_offset": 21, "type": "ENGLISH", "position": 2 }, { "token": "china", "start_offset": 25, "end_offset": 30, "type": "ENGLISH", "position": 3 } ] }
5. Adding documents
You can add fields freely: thanks to dynamic mapping, a document may contain fields that are not declared in the mapping.
// 1 is the value of "_id"; it must be unique (it can also be auto-generated, see below)
POST http://192.168.197.100:9200/student/_doc/1
{
  "id": 1,
  "name": "tom",
  "age": 20,
  "gender": "male",
  "grade": "7",
  "class": "1"
}
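If you POST to the type endpoint without an id, ES generates a random _id and returns it in the response (the field values below are made up for illustration):

POST http://192.168.197.100:9200/student/_doc
{
  "id": 2,
  "name": "lucy",
  "age": 19,
  "gender": "female",
  "grade": "7",
  "class": "2"
}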
6. Updating documents
POST http://192.168.197.100:9200/student/_doc/1/_update
{
  "doc": {
    "name": "jack"
  }
}
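Note that _update with a "doc" body is a partial update: only the listed fields change. PUTting to the document URL instead replaces the whole document, so any field you leave out is dropped. A sketch of the full-replace form:

PUT http://192.168.197.100:9200/student/_doc/1
{
  "id": 1,
  "name": "jack",
  "age": 20,
  "gender": "male",
  "grade": "7",
  "class": "1"
}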
7. Deleting documents
// 1 is the value of "_id"
DELETE http://192.168.197.100:9200/student/_doc/1
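To verify, fetch the document again; after a successful delete ES answers with "found": false, roughly like this:

GET http://192.168.197.100:9200/student/_doc/1

{
  "_index": "student",
  "_type": "_doc",
  "_id": "1",
  "found": false
}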
That covers the basics: creating, modifying, and deleting an index, and adding, updating, and deleting documents. To keep this post from running too long, document queries will be covered in the next one.