The difference between term and match in Elasticsearch (es)
阿新 • Published: 2020-11-24
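Using term
Start with the definition: term means an exact match, that is, an exact lookup. The search text is not analyzed (split into tokens) before the search runs.
An example makes this concrete. First, index some data: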
{ "title": "love China", "content": "people very love China", "tags": ["China", "love"] }
{ "title": "love HuBei", "content": "people very love HuBei", "tags": ["HuBei", "love"] }
Now query it with term:
{ "query": { "term": { "title": "love" } } }
The result is that both of the documents above are returned:
{ "took": 1, "timed_out": false, "_shards": { "total": 5, "successful": 5, "skipped": 0, "failed": 0 }, "hits": { "total": 2, "max_score": 0.6931472, "hits": [ { "_index": "test", "_type": "doc", "_id": "8", "_score": 0.6931472, "_source": { "title": "love HuBei", "content": "people very love HuBei", "tags": ["HuBei","love"] } }, { "_index": "test", "_type": "doc", "_id": "7", "_score": 0.6931472, "_source": { "title": "love China", "content": "people very love China", "tags": ["China","love"] } } ] } }
Every document whose title contains the keyword love was returned, but I only want an exact match on love China. Let's see whether the following query finds it:
{ "query": { "term": { "title": "love China" } } }
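Running it returns no data. Conceptually, term is an exact match and looks up a single token as given. The title field was split into separate tokens when it was indexed, so no single token equal to love China exists. How do I match several words with term? Use terms: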
{ "query": { "terms": { "title": ["love", "China"] } } }
The query result is:
{ "took": 1, "timed_out": false, "_shards": { "total": 5, "successful": 5, "skipped": 0, "failed": 0 }, "hits": { "total": 2, "max_score": 0.6931472, "hits": [ { "_index": "test", "_type": "doc", "_id": "8", "_score": 0.6931472, "_source": { "title": "love HuBei", "content": "people very love HuBei", "tags": ["HuBei","love"] } }, { "_index": "test", "_type": "doc", "_id": "7", "_score": 0.6931472, "_source": { "title": "love China", "content": "people very love China", "tags": ["China","love"] } } ] } }
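Everything is returned. Why? Because the values inside the [ ] of terms are combined with OR: a document matches as long as it contains any one of them. To require both words at the same time, use must inside a bool query, as follows: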
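{ "query": { "bool": { "must": [ { "term": { "title": "love" } }, { "term": { "title": "china" } } ] } } }
Note that china in the query above is lowercase. If we search with the capitalized China instead, nothing is found. Why? The title field was analyzed when it was stored, and here the default analyzer (the standard analyzer, which lowercases every token) did the work. Let's look at how that analysis is carried out.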
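The OR behaviour of terms and the AND behaviour of bool/must described above can be sketched with plain Python sets. This is only a toy model, not Elasticsearch code: the `postings` dictionary and the helper names `terms_any` and `must_all` are assumptions made up for illustration, and the stored tokens are assumed to be lowercase.

```python
# Hypothetical token -> document-id postings for the title field of the
# two sample documents (assumption: tokens are stored lowercased).
postings = {
    "love": {"7", "8"},
    "china": {"7"},
    "hubei": {"8"},
}

def terms_any(postings, values):
    """terms: a document matches if it contains at least one of the values."""
    result = set()
    for value in values:
        result |= postings.get(value, set())
    return result

def must_all(postings, values):
    """bool/must of term clauses: a document must contain every value."""
    sets = [postings.get(value, set()) for value in values]
    return set.intersection(*sets) if sets else set()

# "China" (capitalized) matches nothing, but "love" alone matches both documents.
any_hits = terms_any(postings, ["love", "China"])
# Both lowercase tokens must be present, so only document 7 qualifies.
all_lower = must_all(postings, ["love", "china"])
# The capitalized value is not in the postings, so the intersection is empty.
all_upper = must_all(postings, ["love", "China"])

print(any_hits, all_lower, all_upper)
```

The sketch also shows why the terms query earlier returned both documents even though China was capitalized: the love clause alone was enough.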
The analyzer
GET test/_analyze { "text" : "love China" }
The result is:
{ "tokens": [ { "token": "love", "start_offset": 0, "end_offset": 4, "type": "<ALPHANUM>", "position": 0 }, { "token": "china", "start_offset": 5, "end_offset": 10, "type": "<ALPHANUM>", "position": 1 } ] }
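The analysis produces two tokens, love and china. term only matches those tokens exactly as they are stored, with no modification of the search value. That is why the query using China fails. A later section is devoted to analyzers.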
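To make the point above concrete, here is a minimal sketch of a toy inverted index. It is not how Elasticsearch is implemented: `analyze` is a rough stand-in for the default analyzer (it assumes splitting on non-alphanumeric characters plus lowercasing), and `build_index` and `term_query` are hypothetical helper names.

```python
import re

def analyze(text):
    """Toy stand-in for the analyzer: split on non-alphanumerics, then lowercase."""
    return [tok.lower() for tok in re.findall(r"[A-Za-z0-9]+", text)]

def build_index(docs):
    """Map each token to the set of document ids whose title contains it."""
    index = {}
    for doc_id, title in docs.items():
        for token in analyze(title):
            index.setdefault(token, set()).add(doc_id)
    return index

def term_query(index, value):
    """Look the value up as-is, without analyzing it (the term behaviour)."""
    return index.get(value, set())

docs = {"7": "love China", "8": "love HuBei"}
index = build_index(docs)

print(sorted(index))                    # tokens stored for the title field
print(term_query(index, "love"))        # both documents
print(term_query(index, "China"))       # empty: only "china" is stored
print(term_query(index, "love China"))  # empty: no such single token
```

Because the lookup value is never passed through `analyze`, anything that differs from a stored token, even only in case, finds nothing.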
Using match
First, try matching with love China.
GET test/doc/_search { "query": { "match": { "title": "love China" } } }
The result is:
{ "took": 1, "timed_out": false, "_shards": { "total": 5, "successful": 5, "skipped": 0, "failed": 0 }, "hits": { "total": 2, "max_score": 1.3862944, "hits": [ { "_index": "test", "_type": "doc", "_id": "7", "_score": 1.3862944, "_source": { "title": "love China", "content": "people very love China", "tags": [ "China", "love" ] } }, { "_index": "test", "_type": "doc", "_id": "8", "_score": 0.6931472, "_source": { "title": "love HuBei", "content": "people very love HuBei", "tags": [ "HuBei", "love" ] } } ] } }
Both documents are returned. Why? Because match first analyzes the search text into tokens and only then does the matching. The title tokens of the two documents above are love, china and hubei. Our search text love China is analyzed into love and china, and these are combined with OR, so a document matches as long as it contains any one of them. What if we want love and China to match together? Use match_phrase.
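The match behaviour described above (analyze the query text, then OR the resulting tokens) can be sketched as follows. This is a simplified model under stated assumptions: `analyze` is a toy stand-in for the default analyzer, `match_query` is a hypothetical helper, and the count of matched tokens is only a crude proxy for relevance, not Elasticsearch's actual scoring.

```python
import re

def analyze(text):
    """Toy stand-in for the analyzer: split on non-alphanumerics, then lowercase."""
    return [tok.lower() for tok in re.findall(r"[A-Za-z0-9]+", text)]

def match_query(docs, text):
    """Analyze the query text, then keep documents containing ANY query token.

    Returns {doc_id: number of distinct query tokens found in the title}.
    """
    query_tokens = set(analyze(text))
    hits = {}
    for doc_id, title in docs.items():
        matched = query_tokens & set(analyze(title))
        if matched:
            hits[doc_id] = len(matched)
    return hits

docs = {"7": "love China", "8": "love HuBei"}
hits = match_query(docs, "love China")
print(hits)
```

Document 7 contains both query tokens and document 8 only one, which is in line with document 7 being ranked first in the real response above.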
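Using match_phrase
match_phrase is known as phrase search. It requires every token of the search text to appear in the field, in the same order and directly next to each other.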
GET test/doc/_search { "query": { "match_phrase": { "title": "love china" } } }
The result is:
{ "took": 5, "timed_out": false, "_shards": { "total": 5, "successful": 5, "skipped": 0, "failed": 0 }, "hits": { "total": 1, "max_score": 1.3862944, "hits": [ { "_index": "test", "_type": "doc", "_id": "7", "_score": 1.3862944, "_source": { "title": "love China", "content": "people very love China", "tags": [ "China", "love" ] } } ] } }
This time the result seems to be what we wanted: only one record is returned.
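The phrase requirement (all tokens present, in order, adjacent) can be sketched as a search for a contiguous run of tokens. As before, this is only an illustration: `analyze` is a toy stand-in for the default analyzer and `match_phrase` is a hypothetical helper, not the real implementation, which works on stored token positions.

```python
import re

def analyze(text):
    """Toy stand-in for the analyzer: split on non-alphanumerics, then lowercase."""
    return [tok.lower() for tok in re.findall(r"[A-Za-z0-9]+", text)]

def match_phrase(docs, phrase):
    """Keep documents whose title tokens contain the analyzed phrase as a contiguous run."""
    wanted = analyze(phrase)
    if not wanted:
        return []
    hits = []
    for doc_id, title in docs.items():
        tokens = analyze(title)
        # Slide a window of len(wanted) tokens across the title.
        for start in range(len(tokens) - len(wanted) + 1):
            if tokens[start:start + len(wanted)] == wanted:
                hits.append(doc_id)
                break
    return hits

docs = {"7": "love China", "8": "love HuBei"}
print(match_phrase(docs, "love china"))  # only document 7
print(match_phrase(docs, "china love"))  # nothing: the order differs
```

Since the phrase is analyzed first, love China and love china behave the same here, unlike with term.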
Original link: https://www.jianshu.com/p/d5583dff4157